FactoidPluginRewrite

From UbuntuBots

Note: none of this is set in stone; this is my list of things to consider
to make a nice factoid plugin. It mostly comes from my experience with Encyclopedia
and with a written-from-scratch factoid plugin that I use in Spanish channels.
Feel free to add and discuss ideas, here or in #ubuntu-bots-devel.


Blueprint

A general overview of what we want to achieve with the rewrite.

Abstraction

Brief

We need to use an abstracted API for databases, firstly so we aren't tied to a specific database type and secondly so we can avoid using raw SQL commands where possible.

Details

Options like Storm look worth a try. There's a junk branch called Stormyfacts, see #Stormyfacts

Global/Local factoids

Brief

Normally a factoid is global, meaning it is available in all channels the bot is in. A local factoid can override a global factoid in the channel where it was defined.

Details

Global factoids are stored as "foo" while local factoids as "foo#channel"

  • Calling for "foo" in #channel would check for "foo#channel" and "foo" factoids.
  • Calling for "foo#channel" would check for "foo#channel" only
  • Calling for "foo#" would check for "foo" only
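The three lookup rules above could be sketched roughly like this. It is only an illustration: lookup_keys is a made-up name, and it assumes channel names always start with "#" so that "foo" plus "#channel" gives the stored key.

```python
def lookup_keys(request, channel):
    """Return the database keys to try, in order, for a factoid request.

    Assumption: `channel` is the IRC channel name including its leading "#".
    """
    name, sep, suffix = request.partition("#")
    if not sep:
        # "foo": try the local factoid first, then the global one
        return [name + channel, name]
    if not suffix:
        # "foo#": global factoid only
        return [name]
    # "foo#channel": that channel's local factoid only
    return [name + "#" + suffix]
```

For example, lookup_keys("foo", "#ubuntu") gives the keys "foo#ubuntu" and "foo", in that order.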

Factoid revision control

Brief

When edits are made to factoids the changes are stored in the database, but currently these changes are not used for anything. It would be nice if these revisions could be retrieved and reverted.

Details

The changes log would store information such as the old factoid value, who changed it, when, and a revision number. We could use this information to allow different revisions of a factoid to be logged, retrieved and possibly reverted. See History.

Support Localization

Brief

The strings in the plugin and the factoids themselves should have the ability to be localized, the default being English. This would allow LoCo channels to translate the plugin and create a localized version of the factoids database.

Details

We could use a system such as gettext to allow the strings in the plugin to be translated. For the factoid database we could have multiple databases, like for global/local factoids. See Localization.

Web interface

Brief

The web interface would allow us to do everything we currently do over IRC, but from a browser, such as adding/editing/deleting factoids and moderating factoid addition/edit requests.

Details

This would likely be done as part of the Bantracker2 (issue tracker) project, but we need to interoperate with it so we need to consult.

Flood control

Brief

We need some protection from flooding the bot with requests for factoids and for making sure the bot does not flood channels with replies.

Details

This should include having a time limit on the bot "repeating", as is already done. We should also think about stopping people from flooding by making several factoid requests in a short time. Currently, there is no protection from that.
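Both limits could be kept in one small helper. The following is a sketch under assumptions: the class name FloodGuard, its parameters and the default values are invented for illustration, and a real plugin would probably read them from config options.

```python
import time
from collections import defaultdict, deque


class FloodGuard:
    """Hypothetical helper: limits repeats per channel and requests per user."""

    def __init__(self, repeat_delay=60, max_requests=3, window=10,
                 clock=time.monotonic):
        self.repeat_delay = repeat_delay  # seconds before a channel sees the same factoid again
        self.max_requests = max_requests  # requests allowed per user...
        self.window = window              # ...within this many seconds
        self.clock = clock
        self._requests = defaultdict(deque)  # nick -> timestamps of recent requests
        self._last_reply = {}                # (channel, factoid) -> time of last reply

    def allow(self, nick, channel, factoid):
        now = self.clock()
        recent = self._requests[nick]
        # forget requests that fell out of the time window
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # this user is asking too fast
        recent.append(now)  # note: a suppressed repeat still counts as a request
        last = self._last_reply.get((channel, factoid))
        if last is not None and now - last < self.repeat_delay:
            return False  # the bot said this in the channel a moment ago
        self._last_reply[(channel, factoid)] = now
        return True
```

The plugin would call guard.allow(nick, channel, name) before replying and stay silent (or reply in private) when it returns False.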

Recursive factoids

Brief

When we make factoids it is often useful to use some sort of place-holder so we don't need so many channel-specific factoids. These place-holders would retrieve the content of other factoids and insert them in-place.

Details

For instance we currently use $chan to represent the channel the factoid was called in, and several $cur* variables for current releases. All of these, except $chan, are stored as configuration variables. We should consider replacing this system with the ability to retrieve other factoids and insert them into other factoids. This may take the form of "LTS means Long Term Support. LTS versions of Ubuntu will be supported for 3 years on the desktop, and 5 years on the server. The current LTS is %{current_lts}", where !current_lts would reply "Ubuntu 10.04 (!Lucid Lynx 10.04)"
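A rough sketch of how %{name} place-holders might be expanded, including a guard against factoids that include each other. The names expand and get_reply are hypothetical, and treating place-holder names as case-insensitive is an assumption.

```python
import re

PLACEHOLDER = re.compile(r"%\{([^}]+)\}")


def expand(reply, get_reply, _seen=()):
    """Replace each %{name} in `reply` with the reply of factoid `name`.

    `get_reply` is any callable returning a factoid's reply, or None if the
    factoid doesn't exist (here a plain dict's .get is enough).
    """
    def substitute(match):
        name = match.group(1).lower()  # assumption: names are case-insensitive
        if name in _seen:
            raise ValueError("recursive factoid reference: %s" % name)
        inner = get_reply(name)
        if inner is None:
            return match.group(0)  # unknown factoid: leave the place-holder as-is
        return expand(inner, get_reply, _seen + (name,))

    return PLACEHOLDER.sub(substitute, reply)


facts = {
    "current_lts": "Ubuntu 10.04 (!Lucid Lynx 10.04)",
    "lts": "The current LTS is %{current_lts}",
}
print(expand(facts["lts"], facts.get))
```

The same check could be run before a factoid is saved, so a loop is rejected at edit time instead of when the factoid is called.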

Categorized factoids

Brief

It would be good if we could add a "category" factoid type, for instance a specialized factoid for "support" channels, or for "development" channels or "offtopic" channels.

Details

We would have a channel specific config variable, eg: category, which will default to "general". Or have a command to create a new config group and add channels to it, similar to the way Bugtracker adds trackers. When looking up a factoid the bot will use this category to determine which factoid to display. This would allow, for instance, the !offtopic factoid to be more specialized without having to create one for each channel. So for "general" channels we would have !offtopic set to "Please take general chat to ...", and "support" to "$chan is a support only channel, please use ... for general chat".

User management

Brief

It would be great if we could give a group of people the ability to edit/create their own channel factoids, without allowing them to touch other channel/global factoids. That way we could let LoCo teams manage their own local-language factoids.

Details

I'm not sure how we'll do this, maybe use channel capabilities? or somehow make virtual groups in supybot, but I would really like to see this happen. We could use Launchpad here.

LP Integration

Brief

Users and certain "people listing" factoids should be managed/generated from Launchpad groups.

Details

We should use the API to generally manage access rights and to generate certain "user list" factoids, such as !ops. We should provide a general way to create factoids with lists of nicks, such as !factoid is Calling all ops ${LPNicks:irc-${channel}-ops}. This would require the Recursive factoids to be sub-recursive and pre-process "special" tags too. The bot should verify the factoid before attempting to add it to the database, ensuring no infinite recursion and existence of "tags" and factoids.

database type

As discussed in #ubuntu-bots-devel we would need to support more than one db type for several workflows:

using just MySQL
suitable for ubottu
using just SQLite
suitable for a single bot with its own factoid db
using SQLite and syncing from a MySQL db
suitable for bots not in ubottu's server but that use its factoids, like ubot* bots

SQLite

Good

  • is in python standard lib (sqlite3) and doesn't need a server
  • Database files can be split up and attached to a connection.
  • May improve speed for large databases with multiple tables

Bad

  • locking issues, only reliable for one client (just the bot)
  • sqlite seems to be tsimpson's nemesis.

MySQL

Good

  • multi clients (other bots and web interface)
  • faster than sqlite (for a large bans database)

Bad

  • bot must be on the same server as the database, or there will be lag
  • when fetching data from the db, it returns a tuple instead of a nice Row object.

PostgreSQL

I don't know, I never used it.

Bad

  • It seems each "connection" starts a new postgres process, probably won't be a problem for us though

table structure

factoids

Where the factoids themselves are stored.

name TEXT PRIMARY KEY Name of the fact, must be unique.
created_by INTEGER id of the user that created it (id from user table)
created_at DATETIME when was the fact created
edited_by INTEGER id of the user that last edited it (id from user table)
edited_at DATETIME when it was last edited
modes TEXT fact modes: list of "modes" that apply to the fact, see below
requests INTEGER times the factoid was requested in a public channel
alias TEXT if this is an alias, name of the factoid it points to
reply TEXT Factoid reply

Wouldn't it be enough for aliases to have the reply ${other_fact}? Tsimpson 12:43, 6 June 2010 (UTC)

  • Yeah, it's feasible. I originally didn't want to save the alias in the reply column because it would add a new record in the history table, but I guess storing alias changes will have its merits as well. M4v 16:40, 29 June 2010 (UTC)

factoid modes

This would replace the use of <tags> to mark factoids, such as <deleted>. The reason is that the factoid history would record any change in the factoid reply, and these tags aren't important to record; they are mostly flags that are either there or not.

The factoid mode format is a comma separated list of modes, e.g., deleted,locked

Some factoid modes

deleted
this fact is deleted and unavailable
locked
this fact is locked and can't be edited.
alias
this fact is an alias, the argument should indicate which fact it points to.
moved this to its own column, using modes with arguments makes things trickier than it's worth.
alert
this fact would alert in #alertChannel whenever it is called, such as !ops. #alertChannel could be defined in a config option.
private
the reply of this fact would be sent in private, no matter what. I thought of this mostly for informative facts that could flood or cause unnecessary highlights. E.g., if you create a !whoisop factoid that has the same contents as !ops, but which isn't intended to alert anyone, only to give a list of operators.
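Since modes are just a comma separated list of flags, reading and writing the column is simple. A minimal sketch (parse_modes and format_modes are made-up names):

```python
def parse_modes(text):
    """Turn the stored "deleted,locked" string into a set of mode names."""
    return {mode.strip() for mode in text.split(",") if mode.strip()}


def format_modes(modes):
    """Turn a set of mode names back into the stored string (sorted for stable output)."""
    return ",".join(sorted(modes))


modes = parse_modes("deleted,locked")
modes.discard("deleted")      # e.g. when a factoid is restored
print(format_modes(modes))
```

Checking a flag is then just a membership test, such as "locked" in modes.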

other

  • the <reply> tag should go away, instead, store in the database the exact factoid reply. This is simpler.
  • datetime should be stored in some format supported by the database, like in unixepoch or "YYYY-MM-DD HH:MM:SS". Don't use freaky stuff like pickled datetime objects or some string that makes date handling and sorting difficult.

History

Table where the revisions of each factoid are stored.

name TEXT factoid name
revision INTEGER revision number, 0 would be the original factoid's reply
edited_by INTEGER previous editor
edited_at DATETIME previous edit time
reply TEXT previous reply

users

Using a user table would save db space, since there would be no need to store the full hostmask string with every factoid, just the id number.

id INTEGER PRIMARY KEY User id, unique
hostmask TEXT User hostmask. Each hostmask should have a single id.
first_seen DATETIME first time user edited the db
last_seen DATETIME last time user edited the db
factoids INTEGER total factoids created by user (for statistics)
edits INTEGER total of edits done by user (for statistics)

Just one user table should be used.
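For reference, the three tables described above could be created in SQLite roughly as follows. The column names and types come from the lists above; the defaults, the UNIQUE constraint on hostmask, the foreign key references and the (name, revision) primary key on history are my assumptions, and open_db is a made-up helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id         INTEGER PRIMARY KEY,
    hostmask   TEXT UNIQUE NOT NULL,   -- assumption: one id per hostmask
    first_seen DATETIME,
    last_seen  DATETIME,
    factoids   INTEGER DEFAULT 0,
    edits      INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS factoids (
    name       TEXT PRIMARY KEY,
    created_by INTEGER REFERENCES users(id),
    created_at DATETIME,
    edited_by  INTEGER REFERENCES users(id),
    edited_at  DATETIME,
    modes      TEXT DEFAULT '',
    requests   INTEGER DEFAULT 0,
    alias      TEXT,
    reply      TEXT
);
CREATE TABLE IF NOT EXISTS history (
    name      TEXT,
    revision  INTEGER,
    edited_by INTEGER REFERENCES users(id),
    edited_at DATETIME,
    reply     TEXT,
    PRIMARY KEY (name, revision)       -- assumption: one row per revision
);
"""


def open_db(path=":memory:"):
    """Open (and if needed initialise) a factoid database."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows addressable by column name
    conn.executescript(SCHEMA)
    return conn
```

With an abstraction layer such as Storm the same structure would be declared as classes instead of raw SQL; this only shows the shape of the data.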

history

Every time a factoid's value (the factoid's reply) is edited, the old value should be stored in a separate table. This way changes can be reverted easily in cases of mistakes or vandalism.

!undo foo
revert foo to its previous reply
!redo foo
revert a previous undo action
!foo --rev id
reply using an old revision. For a negative id the revision is equal to last_rev + id + 1, so -1 would show foo's previous value (last_rev) and -2 the one before that. Zero and positive values would be the actual revision numbers, e.g., !foo --rev 0 should be the original !foo when it was created.
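The --rev arithmetic could be sketched like this. resolve_revision is a made-up name, and raising an error for out-of-range values is an assumption; the plugin could just as well reply with a message.

```python
def resolve_revision(rev_id, last_rev):
    """Map the argument of --rev to a revision number in the history table.

    `last_rev` is the highest revision stored for the factoid, i.e. its
    previous value. Negative ids count back from it, others are literal.
    """
    if rev_id < 0:
        revision = last_rev + rev_id + 1
    else:
        revision = rev_id
    if not 0 <= revision <= last_rev:
        raise ValueError("no such revision: %d" % rev_id)
    return revision
```

With last_rev being 4, --rev -1 maps to revision 4, --rev -2 to revision 3, and --rev 0 to the original reply.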

Alias

Alias support should be more robust (it shouldn't be possible to create recursive aliases), and an alias should behave like the factoid itself, e.g., editing an alias should edit the contents of the factoid the alias is pointing to. This way it is less confusing and more transparent.
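One way to follow aliases to the real factoid while refusing loops is sketched below. It is only an illustration: resolve_alias is a hypothetical name, and the aliases mapping stands in for the alias column of the factoids table.

```python
def resolve_alias(name, aliases):
    """Follow alias links until a real factoid name is reached.

    `aliases` maps an alias name to the name it points to. A ValueError is
    raised if the chain ever comes back to a name already visited.
    """
    seen = [name]
    while name in aliases:
        name = aliases[name]
        if name in seen:
            raise ValueError("recursive alias: " + " -> ".join(seen + [name]))
        seen.append(name)
    return name
```

Edits and lookups would both go through this, so editing an alias transparently edits its target. Running the same check on the would-be new mapping before saving an alias keeps loops out of the database.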

localization

commands

Several commands are locale dependent, such as "foo is bar", or "tell user about foo". This could be done with regexps: given a config option (the language), a different set of regexps is used. These could even be channel specific, i.e., in #ubuntu English regexps are used, while in #ubuntu-es Spanish ones would be used.

This is good as long as there are few commands that need translated regexps.
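A sketch of the per-language regexps idea. The patterns, the Spanish phrasing and the names COMMANDS and parse_command are invented for illustration only; the language would come from a (possibly channel specific) config option.

```python
import re

# One set of command regexps per language (illustrative patterns only).
COMMANDS = {
    "en": {
        "tell": re.compile(r"^tell (?P<nick>\S+) about (?P<name>\S+)$"),
        "add": re.compile(r"^(?P<name>\S+) is (?P<reply>.+)$"),
    },
    "es": {
        "tell": re.compile(r"^dile a (?P<nick>\S+) sobre (?P<name>\S+)$"),
        "add": re.compile(r"^(?P<name>\S+) es (?P<reply>.+)$"),
    },
}


def parse_command(text, language="en"):
    """Return (command, arguments) for the first matching regexp, or None."""
    for command, pattern in COMMANDS[language].items():
        match = pattern.match(text)
        if match:
            return command, match.groupdict()
    return None


print(parse_command("foo is bar"))
print(parse_command("dile a pepe sobre foo", language="es"))
```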

strings

These are the strings that the plugin replies with, most likely error or informative messages. I haven't studied how to do this yet, whether by using gettext or some other alternative.
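If gettext is chosen, Python's standard gettext module would be enough for the plugin strings. A minimal sketch, where the "factoids" domain, the "locale" directory and the get_translator helper are all made up for the example:

```python
import gettext


def get_translator(language, localedir="locale"):
    """Return a function that translates plugin strings into `language`.

    With fallback=True, a missing catalog gives back the original
    (English) string instead of raising an error.
    """
    translation = gettext.translation(
        "factoids", localedir=localedir, languages=[language], fallback=True)
    return translation.gettext


_ = get_translator("es")
print(_("Sorry, I don't know anything about that."))
```

Translators would then only need to provide a compiled catalog for their language under the locale directory; until one exists, replies stay in English.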

factoids

Translating factoids isn't easy; it is not just a matter of translating strings but of finding links to replace the English wikis (or even translating them in another wiki). This should be done by dedicated users using the bot. Below is an approach I'm currently implementing in Spanish channels to facilitate this on IRC, i.e., without a web interface or something like that.

I'm thinking of adding a sort of cross-database support, where each bot (or channel) has a main db and a fallback db. The main db would store the translated factoids and would be editable; the fallback db would be ubottu's factoids and would be read only. When anyone requests a factoid, the plugin would look in the main db and, if it isn't there, try searching in the fallback db if one is defined. This would mean that with a newly deployed bot, all factoids would be in English until the users there create the equivalent factoids in their language. Factoid names would be translated by simply creating an alias. This can be exploited for other channels like -offtopic: they can have their own db file and use ubottu's factoids as a fallback, so they can create their own "fun" factoids without touching support factoids, and yet have them available.
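The main/fallback lookup could be sketched as follows. Plain dicts stand in for the two databases, and FactoidStore is a hypothetical name; the point is only that reads fall through to the fallback while writes always go to the main db.

```python
class FactoidStore:
    """Hypothetical store with an editable main db and a read-only fallback."""

    def __init__(self, main, fallback=None):
        self.main = main          # translated, editable factoids
        self.fallback = fallback  # e.g. ubottu's factoids, never written to

    def get(self, name):
        if name in self.main:
            return self.main[name]
        if self.fallback is not None:
            return self.fallback.get(name)
        return None

    def set(self, name, reply):
        # edits never touch the fallback db
        self.main[name] = reply


english = {"paste": "Please use a pastebin."}
store = FactoidStore(main={}, fallback=english)
print(store.get("paste"))                      # English, nothing translated yet
store.set("paste", "Usa un pastebin, por favor.")
print(store.get("paste"))                      # now the local translation
```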

erUSUL pointed out to me that having people call factoids that may or may not turn out to be translated might confuse users. In cases where using a fallback db is not desirable, a special argument to retrieve English factoids would be better, so a translator can still check an English factoid with "!foo --db en". This isn't too different from opening a query with ubottu, but it allows calling English factoids in the channel directly.

Any thoughts? I'll have this implemented in our bot in #ubuntu-es soon, and see how it goes.

unicode

Unicode objects must be used for the name and reply of a factoid, to support the special chars other languages use, like ñ in Spanish. The same should apply to any string used in a reply and to factoid modes, since an alias argument may have unicode chars.

web interface

TODO not sure about this, I think we want to use django, *pokes Pici*

protection

edit protection

TODO describe how edits should be restricted

repeat protection

TODO describe protection against calling a factoid or command too often

syntax

Creating a factoid

Since the <reply> tag shouldn't be used anymore, I want to suggest using "foo: bar" in place of "foo is <reply> bar"

Calling global or local factoids

syntax for calling factoids

!foo
any factoid, look first for a local factoid in #channel, if it doesn't exist, look for a global one.
!foo#channel
local factoid, look for a local factoid in #channel, if it doesn't exist, display error.
!foo#
global factoid, look for a global factoid regardless of whether a local factoid exists, display error if it doesn't exist.

I'd like to drop the foo-#channel syntax of Encyclopedia, I think foo# and foo#channel are nicer than foo-# and foo-#channel

Forbid factoids with spaces

This is something I was wondering: should we support factoids with spaces in their names? They're hardly used, and though it is a nice feature, forbidding spaces would make parsing a lot simpler, and we could implement factoid calls like:

!foo nick: please look this

instead of

!foo | nick: please look this

I don't care either way, I'm keen to keep spaces, but I wouldn't mind if they go away.

replaces

Factoids can contain several keywords that would be replaced when calling it.

keyword replaces

These are just the simple $keyword replaces, such as $channel, $nick, $botnick. Dynamic !ops factoids would use any of these replacers

${channel_ops_lp}
this would be replaced by a list of channel operators, as defined in the launchpad group; this needs the AccessManager plugin to be finished
${channel_ops}
same as the one above, but the ops list should be fetched from ChanServ. For channels that don't have their ops as members of some lp group.

We should discuss if the dynamic list should be generated with the ops present in the channel, or with all of them.

factoid replaces

This basically replaces a special keyword with the reply of another factoid. It would take the place of the custom keywords currently used by Encyclopedia, like $curStable: instead, you create a factoid and use it as a replacer. Example:

!curStable is <reply> Ubuntu 10.04
!ubuntu is <reply> Current stable release is %{curstable}

This is much more flexible, and means a bot owner/admin is no longer needed to update them.


testcases

This is very important. By looking at all this you can guess that the plugin will get complicated; testcases are needed to avoid regressions and make the code easier to maintain. They also help newcomers, who can start hacking away and learn the code without fear of breaking stuff, and they make debugging a lot easier as well.

Most (by most, read all) plugins in ubottu lack testcases. This is annoying, since Python and Supybot make it ridiculously easy to create testcases. It is true that not everything can be tested, but the main features must have a testcase.

People who neglect this should be severely punished!

Some code

Stormyfacts

Using Storm branch

Factos

It shouldn't come as a surprise that I wrote my own factoid plugin to help in Spanish channels. It's called Factos (the word we came up with for "factoid"); its devel branch is located here, along with a somewhat outdated README. It has most of the concepts I mentioned here implemented, but not exactly in the same way (mostly because I haven't had time to refactor it yet). Because I wrote it from scratch it has some major differences from how Encyclopedia behaves: for example, besides not allowing factoids with spaces, it uses the invalidCommand method to call factoids and, as a consequence, factoids can't override supybot's own commands, which always annoyed me. Nevertheless, I hope Factos can serve as an inspiration for this new plugin.

database stuff

Factos has a somewhat abstracted api for handling databases. Still, it would be good to check other alternatives. Anyway, I wrote some examples about using Factos objects here: FactosCodeExample

  • I'm currently trying to use storm, I think it would be better. M4v