FactoidPluginRewrite
Note: none of this is set in stone; these are my ideas of things to consider for making a nice factoid plugin. They mostly come from my experience with Encyclopedia and a written-from-scratch factoid plugin that I use in Spanish channels. Feel free to add and discuss ideas, here or in #ubuntu-bots-devel.
Blueprint
A general overview of what we want to achieve with the rewrite.
Abstraction
Brief
We need to use an abstracted API for databases, firstly so we aren't tied to a specific database type and secondly so we can avoid using raw SQL commands where possible.
Details
Options like Storm look worth a try. There's a junk branch called Stormyfacts; see #Stormyfacts.
Global/Local factoids
Brief
Normally a factoid is global, meaning it is available in all channels the bot is in. A local factoid can override a global factoid in the channel where it was defined.
Details
Global factoids are stored as "foo" while local factoids as "foo#channel"
- Calling for "foo" in #channel would check for "foo#channel" and "foo" factoids.
- Calling for "foo#channel" would check for "foo#channel" only
- Calling for "foo#" would check for "foo" only
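The lookup order above could be sketched as a small helper that turns a request into the list of database keys to try. This is only an illustration: lookup_keys is a hypothetical name, and it assumes the channel name is passed with its leading "#".

```python
def lookup_keys(request, channel):
    """Return the database keys to try, in order, for a factoid request.

    Hypothetical helper; `channel` is assumed to include its leading "#".
    """
    name, sep, suffix = request.partition("#")
    if not sep:
        # "foo": try the local factoid of the current channel, then the global one.
        return [name + channel, name]
    if not suffix:
        # "foo#": global factoid only.
        return [name]
    # "foo#channel": that channel's local factoid only.
    return [name + "#" + suffix]


print(lookup_keys("foo", "#ubuntu"))
print(lookup_keys("foo#ubuntu-es", "#ubuntu"))
print(lookup_keys("foo#", "#ubuntu"))
```

The plugin would then query each key in turn and reply with the first one that exists.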
Factoid revision control
Brief
When edits are made to factoids the changes are stored in the database, but currently these changes are not used for anything. It would be nice if these revisions could be retrieved and reverted.
Details
The changes log would store information such as the old factoid value, who changed it, when, and a revision number. We could use this information to allow different revisions of a factoid to be logged, retrieved and possibly reverted. See History.
Support Localization
Brief
The strings in the plugin and the factoids themselves should have the ability to be localized, the default being English. This would allow LoCo channels to translate the plugin and create a localized version of the factoids database.
Details
We could use a system such as gettext to allow the strings in the plugin to be translated. For the factoid database we could have multiple databases, like for global/local factoids. See Localization.
Web interface
Brief
The web interface would allow us to do everything we currently do over IRC, but from a browser, such as adding/editing/deleting factoids and moderating factoid addition/edit requests.
Details
This would likely be done as part of the Bantracker2 (issue tracker) project, but we need to interoperate with it so we need to consult.
Flood control
Brief
We need some protection from flooding the bot with requests for factoids and for making sure the bot does not flood channels with replies.
Details
This should include having a time limit on the bot "repeating", as is currently already done. We should also think about stopping people from being able to flood by making several factoid requests in a short time. Currently, there is no protection from that.
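A minimal sketch of both protections, assuming a per-(channel, factoid) repeat delay and a per-nick request window. FloodGuard, its parameters and the default limits are all hypothetical; they are not taken from the current plugin.

```python
import time


class FloodGuard:
    """Hypothetical flood guard: repeat delay per factoid, rate limit per nick."""

    def __init__(self, repeat_delay=60.0, max_requests=3, window=10.0,
                 clock=time.monotonic):
        self.repeat_delay = repeat_delay  # seconds before a factoid is repeated
        self.max_requests = max_requests  # requests allowed per nick per window
        self.window = window              # length of the window, in seconds
        self._clock = clock
        self._last_reply = {}             # (channel, factoid) -> time of last reply
        self._requests = {}               # nick -> times of recent requests

    def allow(self, nick, channel, factoid):
        """Return True if the bot should answer this request."""
        now = self._clock()
        recent = [t for t in self._requests.get(nick, [])
                  if now - t < self.window]
        if len(recent) >= self.max_requests:
            # This nick asked for too many factoids in a short time.
            self._requests[nick] = recent
            return False
        recent.append(now)
        self._requests[nick] = recent
        last = self._last_reply.get((channel, factoid))
        if last is not None and now - last < self.repeat_delay:
            # The bot already said this in the channel recently.
            return False
        self._last_reply[(channel, factoid)] = now
        return True
```

The clock is injectable only so the behaviour can be tested without sleeping.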
Recursive factoids
Brief
When we make factoids it is often useful to use some sort of place-holder so we don't need so many channel-specific factoids. These place-holders would retrieve the content of other factoids and insert them in-place.
Details
For instance we currently use $chan to represent the channel the factoid was called in, and several $cur* variables for current releases. All of these, except $chan, are stored as configuration variables. We should consider replacing this system with the ability to retrieve other factoids and insert them into other factoids. This may take the form of "LTS means Long Term Support. LTS versions of Ubuntu will be supported for 3 years on the desktop, and 5 years on the server. The current LTS is %{current_lts}", where !current_lts would reply "Ubuntu 10.04 (!Lucid Lynx 10.04)"
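A rough sketch of how such %{name} place-holders might be expanded, with a guard against factoids that end up including themselves. The expand function and the get_reply callback are hypothetical, and treating names as case-insensitive is an assumption.

```python
import re

# Matches place-holders such as %{current_lts}
PLACEHOLDER = re.compile(r"%\{(\w+)\}")


def expand(reply, get_reply, _seen=()):
    """Replace each %{name} in `reply` with the expanded reply of factoid `name`.

    `get_reply` is a hypothetical callback: it takes a factoid name and
    returns its reply, or None when the factoid doesn't exist.
    """
    def substitute(match):
        name = match.group(1).lower()  # assumption: names are case-insensitive
        if name in _seen:
            raise ValueError("recursive factoid reference: %s" % name)
        target = get_reply(name)
        if target is None:
            return match.group(0)  # unknown factoid: leave the place-holder
        return expand(target, get_reply, _seen + (name,))

    return PLACEHOLDER.sub(substitute, reply)


facts = {
    "current_lts": "Ubuntu 10.04 (!Lucid Lynx 10.04)",
    "lts": "The current LTS is %{current_lts}",
}
print(expand(facts["lts"], facts.get))
```

The same function could be run when a factoid is added, so a reply that would recurse forever is rejected before it reaches the database.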
Categorized factoids
Brief
It would be good if we could add a "category" factoid type, for instance a specialized factoid for "support" channels, or for "development" channels or "offtopic" channels.
Details
We would have a channel specific config variable, eg: category, which will default to "general". Or have a command to create a new config group and add channels to it, similar to the way Bugtracker adds trackers. When looking up a factoid the bot will use this category to determine which factoid to display. This would allow, for instance, the !offtopic factoid to be more specialized without having to create one for each channel. So for "general" channels we would have !offtopic set to "Please take general chat to ...", and "support" to "$chan is a support only channel, please use ... for general chat".
User management
Brief
It would be great if we could give a group of people the ability to edit/create their own channel factoids, without allowing them to touch other channel/global factoids. That way we could let LoCo teams manage their own local-language factoids.
Details
I'm not sure how we'll do this. Maybe use channel capabilities, or somehow make virtual groups in supybot? Either way, I would really like to see this happen. We could use Launchpad here.
LP Integration
Brief
Users and certain "people listing" factoids should be managed/generated from Launchpad groups.
Details
We should use the API to generally manage access rights and to generate certain "user list" factoids, such as !ops. We should provide a general way to create factoids with lists of nicks, such as !factoid is Calling all ops ${LPNicks:irc-${channel}-ops}. This would require the Recursive factoids to be sub-recursive and pre-process "special" tags too. The bot should verify the factoid before attempting to add it to the database, ensuring no infinite recursion and existence of "tags" and factoids.
database type
As discussed in #ubuntu-bots-devel, we would need to support more than one db type for several workflows:
- using just MySQL
- suitable for ubottu
- using just SQLite
- suitable for a single bot with its own factoid db
- using SQLite and syncing from a MySQL db
- suitable for bots not in ubottu's server but that use its factoids, like ubot* bots
SQLite
Good
- is in python standard lib (sqlite3) and doesn't need a server
- Database files can be split up and attached to a connection.
* May improve speed for large databases with multiple tables
Bad
- locking issues, only reliable for one client (just the bot)
- sqlite seems to be tsimpson's nemesis.
MySQL
Good
- multi clients (other bots and web interface)
- faster than sqlite (for large bans database)
Bad
- bot must be on the same server as the database, or there will be lag
- when fetching data from the db, it returns a tuple instead of a nice Row object.
PostgreSQL
I don't know, I never used it.
Bad
- It seems each "connection" starts a new postgres process, probably won't be a problem for us though
table structure
factoids
Where the factoids themselves are stored.
| name | TEXT PRIMARY KEY | Name of the fact, must be unique. |
| created_by | INTEGER | id of the user that created it (id from user table) |
| created_at | DATETIME | when the fact was created |
| edited_by | INTEGER | id of the user that last edited it (id from user table) |
| edited_at | DATETIME | when it was last edited |
| modes | TEXT | fact modes: list of "modes" that apply to the fact, see below |
| requests | INTEGER | times the factoid was requested in a public channel |
| alias | TEXT | if this is an alias, name of the factoid it points to |
| reply | TEXT | Factoid reply |
Wouldn't it be enough for aliases to have the reply ${other_fact}? Tsimpson 12:43, 6 June 2010 (UTC)
- Yeah, it's feasible. I originally didn't want to save the alias in the reply column because it would add a new record in the history table, but I guess storing alias changes will have its merits as well. M4v 16:40, 29 June 2010 (UTC)
factoid modes
This would replace the use of <tags> to mark factoids, such as <deleted>. The reason is that the factoid history would record any change in the factoid reply, and these tags aren't important to record: they are mostly flags that are either there or aren't.
The factoid mode format is a comma separated list of modes, e.g., deleted,locked
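Converting between the stored string and a set of modes could look roughly like this. Both helper names are hypothetical; sorting on output is just one way to keep the stored value stable.

```python
def parse_modes(text):
    """Turn the stored string, e.g. "deleted,locked", into a set of modes."""
    return {mode.strip() for mode in (text or "").split(",") if mode.strip()}


def format_modes(modes):
    """Turn a set of modes back into the comma separated form."""
    return ",".join(sorted(modes))


modes = parse_modes("deleted,locked")
modes.discard("deleted")   # e.g. when a factoid is restored
print(format_modes(modes))
```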
Some factoid modes
- deleted
- this fact is deleted and unavailable
- locked
- this fact is locked and can't be edited.
- alias
- this fact is an alias, argument should indicate which fact it points to. Update: moved this to its own column, since using modes with arguments makes things trickier than it's worth.
- alert
- this fact would alert in #alertChannel whenever it is called, such as !ops. #alertChannel could be defined in a config option.
- private
- the reply of this fact would be sent in private, no matter what. I thought of this mostly for informative facts that could flood or cause unnecessary highlights. E.g., if you create a !whoisop factoid that has the same contents as !ops, but is intended to give a list of operators rather than to alert anyone.
other
- the <reply> tag should go away, instead, store in the database the exact factoid reply. This is simpler.
- datetime should be stored in some format supported by the database, like in unixepoch or "YYYY-MM-DD HH:MM:SS". Don't use freaky stuff like pickled datetime objects or some string that makes date handling and sorting difficult.
History
Table where the revisions of each factoid are stored.
| name | TEXT | factoid name |
| revision | INTEGER | revision number, 0 would be the original factoid's reply |
| edited_by | INTEGER | previous editor |
| edited_at | DATETIME | previous edit time |
| reply | TEXT | previous reply |
users
Using a user table would save db space, since we wouldn't need to store the full hostmask string in every factoid, just the id number.
| id | INTEGER PRIMARY KEY | User id, unique |
| hostmask | TEXT | User hostmask. Each hostmask should have a single id. |
| first_seen | DATETIME | first time user edited the db |
| last_seen | DATETIME | last time user edited the db |
| factoids | INTEGER | total factoids created by user (for statistics) |
| edits | INTEGER | total of edits done by user (for statistics) |
Just one user table should be used.
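To make the three tables above concrete, here is one way they could be created with the sqlite3 module from the standard library. Column names and types follow the tables; the UNIQUE constraint, the DEFAULT values, the foreign key references and the composite primary key on history are assumptions added for the sketch.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    hostmask   TEXT UNIQUE,          -- assumption: one id per hostmask
    first_seen DATETIME,
    last_seen  DATETIME,
    factoids   INTEGER DEFAULT 0,
    edits      INTEGER DEFAULT 0
);
CREATE TABLE factoids (
    name       TEXT PRIMARY KEY,
    created_by INTEGER REFERENCES users(id),
    created_at DATETIME,
    edited_by  INTEGER REFERENCES users(id),
    edited_at  DATETIME,
    modes      TEXT,
    requests   INTEGER DEFAULT 0,
    alias      TEXT,
    reply      TEXT
);
CREATE TABLE history (
    name      TEXT,
    revision  INTEGER,
    edited_by INTEGER REFERENCES users(id),
    edited_at DATETIME,
    reply     TEXT,
    PRIMARY KEY (name, revision)     -- assumption: one row per revision
);
"""


def create_database(path=":memory:"):
    """Create the tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

With an abstraction layer such as Storm the same layout would be declared as classes instead of raw SQL.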
history
Every time a factoid's value (the factoid's reply) is edited, the old value should be stored in a separate table. This way changes can be reverted easily in cases of mistakes or vandalism.
- !undo foo
- revert foo to its previous reply
- !redo foo
- revert a previous undo action
- !foo --rev id
- reply using an old revision. For negative ids the revision is last_rev + id + 1, so -1 would show foo's previous value (last_rev) and -2 the one before it. Non-negative values are taken as the actual revision numbers, e.g., !foo --rev 0 should be the original !foo when it was created.
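The --rev arithmetic could be sketched as below. resolve_revision is a hypothetical helper, last_rev is assumed to be the highest revision number stored in the history table, and raising an error for out-of-range values is an assumption.

```python
def resolve_revision(rev_id, last_rev):
    """Map the argument of --rev to an actual revision number."""
    if rev_id < 0:
        # Negative ids count back from the newest stored revision.
        revision = last_rev + rev_id + 1
    else:
        # Non-negative ids are actual revision numbers.
        revision = rev_id
    if not 0 <= revision <= last_rev:
        raise IndexError("no such revision: %d" % rev_id)
    return revision


# With revisions 0..4 in history:
print(resolve_revision(-1, 4))
print(resolve_revision(-2, 4))
print(resolve_revision(0, 4))
```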
Alias
Alias support should be more robust (it shouldn't be possible to create recursive aliases), and an alias should behave like the factoid itself, e.g., editing an alias should edit the contents of the factoid the alias is pointing to. This way it is less confusing and more transparent.
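A sketch of following an alias chain while refusing loops. resolve_alias and the get_alias callback are hypothetical; get_alias is assumed to return the value of the alias column, or None for a regular factoid.

```python
def resolve_alias(name, get_alias):
    """Follow aliases from `name` and return the factoid that holds the reply."""
    seen = [name]
    while True:
        target = get_alias(name)
        if target is None:
            return name  # not an alias: this is the real factoid
        if target in seen:
            raise ValueError("alias loop: " + " -> ".join(seen + [target]))
        seen.append(target)
        name = target


aliases = {"lucid": "lts", "lts": "10.04"}
print(resolve_alias("lucid", aliases.get))
```

Editing "lucid" would then edit whatever resolve_alias returns, and running the same check when an alias is created would stop recursive aliases from being stored at all.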
localization
commands
Several commands are locale dependent, such as "foo is bar", or "tell user about foo". This could be done by using regexps: given a config option (the language), a different set of regexps is used. These could even be channel specific, i.e., in #ubuntu English regexps are used, while in #ubuntu-es it would use Spanish ones.
This is good as long as there are few commands that need translated regexps.
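For illustration, per-language regexps selected by channel might look like this. The command names, the Spanish phrasings and the CHANNEL_LANGUAGE mapping are made up for the example; in the plugin the language would come from a config option.

```python
import re

# Hypothetical command patterns, one set per language.
COMMANDS = {
    "en": {
        "tell": re.compile(r"^tell (?P<nick>\S+) about (?P<name>\S+)$"),
        "add": re.compile(r"^(?P<name>\S+) is (?P<reply>.+)$"),
    },
    "es": {
        "tell": re.compile(r"^dile a (?P<nick>\S+) sobre (?P<name>\S+)$"),
        "add": re.compile(r"^(?P<name>\S+) es (?P<reply>.+)$"),
    },
}

# Stand-in for a channel specific config option.
CHANNEL_LANGUAGE = {"#ubuntu": "en", "#ubuntu-es": "es"}


def parse_command(text, channel):
    """Return (command, arguments) using the regexps of the channel's language."""
    language = CHANNEL_LANGUAGE.get(channel, "en")
    for command, pattern in COMMANDS[language].items():
        match = pattern.match(text)
        if match:
            return command, match.groupdict()
    return None


print(parse_command("tell nick about foo", "#ubuntu"))
print(parse_command("foo es bar", "#ubuntu-es"))
```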
strings
These are the strings the plugin would reply with, most likely error or informative messages. I haven't studied how to do this yet, whether with gettext or some other alternative.
factoids
Translating factoids isn't easy. It is not just a matter of translating strings, but also of finding links to replace English wikis (or even translating them in another wiki). This should be done by dedicated users using the bot. Below is an approach I'm currently implementing in Spanish channels to facilitate this on IRC, i.e., without a web interface or anything like that.
I'm thinking of adding a sort of cross-database support, where each bot (or channel) has a main db and a fallback db. The main db would store the translated factoids and would be editable; the fallback db would be ubottu's factoids and would be read-only. When anyone requests a factoid, the plugin would look in the main db and, if it isn't there, try searching the fallback db if one is defined. This would mean that with a newly deployed bot, all factoids would be in English until the users there create the equivalent factoids in their language. Factoid names would be translated by simply creating an alias. This can be exploited for other channels like -offtopic: they can have their own db file and use ubottu's factoids as a fallback, so they can create their own "fun" factoids without touching support factoids, and yet have them available.
erUSUL pointed out to me that having people call factoids that may or may not turn out to be translated might confuse users. In cases where using a fallback db isn't desirable, a special argument for retrieving English factoids would be better, so a translator can still check an English factoid with "!foo --db en". This isn't too different from opening a query with ubottu, but it allows calling English factoids in the channel directly.
Any thoughts? I'll have this implemented in our bot in #ubuntu-es soon, and see how it goes.
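The main/fallback idea can be sketched with plain dictionaries standing in for the two databases. FactoidStore is a hypothetical class written for this example; it is not how Factos or Encyclopedia store things.

```python
class FactoidStore:
    """Editable main db with an optional read-only fallback db."""

    def __init__(self, facts=None, fallback=None, read_only=False):
        self.facts = dict(facts or {})
        self.fallback = fallback
        self.read_only = read_only

    def get(self, name):
        """Look in this db first, then in the fallback db if one is defined."""
        if name in self.facts:
            return self.facts[name]
        if self.fallback is not None:
            return self.fallback.get(name)
        return None

    def set(self, name, reply):
        """Store a factoid; edits never touch the fallback db."""
        if self.read_only:
            raise PermissionError("this database is read-only")
        self.facts[name] = reply


english = FactoidStore({"ask": "Please just ask your question."}, read_only=True)
spanish = FactoidStore(fallback=english)
print(spanish.get("ask"))              # served from the fallback db
spanish.set("ask", "Haz tu pregunta.")
print(spanish.get("ask"))              # now served from the main db
```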
unicode
Unicode objects must be used for the name and reply of a factoid, to support the special chars other languages use, like ñ in Spanish. The same should apply to any string used in a reply and to factoid modes, since an alias argument may have unicode chars.
web interface
TODO not sure about this, I think we want to use django, *pokes Pici*
protection
edit protection
TODO describe how edits should be restricted
repeat protection
TODO describe protection against calling a factoid or command too often
syntax
Creating a factoid
Since the <reply> tag shouldn't be used anymore, I want to suggest using "foo: bar" in place of "foo is <reply> bar"
Calling global or local factoids
Syntax for calling factoids:
- !foo
- any factoid, look first for a local factoid in #channel, if it doesn't exist, look for a global one.
- !foo#channel
- local factoid, look for a local factoid in #channel, if it doesn't exist, display error.
- !foo#
- global factoid, look for a global factoid regardless of whether a local factoid exists, display error if it doesn't exist.
I'd like to drop the foo-#channel syntax of Encyclopedia, I think foo# and foo#channel are nicer than foo-# and foo-#channel
Forbid factoids with spaces
This is something I was wondering: should we support factoids with spaces in their names? They're hardly used, and though it is a nice feature, forbidding spaces would make parsing a lot simpler, and we could implement factoid calls like:
!foo nick: please look this
instead of
!foo | nick: please look this
I don't care either way, I'm keen to keep spaces, but I wouldn't mind if they go away.
replaces
Factoids can contain several keywords that would be replaced when calling them.
keyword replaces
These are just the simple $keyword replaces, such as $channel, $nick, $botnick. Dynamic !ops factoids would use any of these replacers:
- ${channel_ops_lp}
- this would be replaced by a list of channel operators, as defined in the launchpad group. This needs the AccessManager plugin to be finished
- ${channel_ops}
- same as the one above, but the ops list should be fetched from ChanServ. For channels that don't have their ops as members of some lp group.
We should discuss if the dynamic list should be generated with the ops present in the channel, or with all of them.
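Simple keyword replaces like these map closely onto string.Template from the standard library, which accepts both the $keyword and ${keyword} forms. In this sketch fill_keywords is a hypothetical helper and the ops list is passed in by hand, since fetching it from Launchpad or ChanServ is out of scope here.

```python
from string import Template


def fill_keywords(reply, channel, nick, botnick, ops=()):
    """Replace $channel, $nick, $botnick and ${channel_ops} in a reply."""
    values = {
        "channel": channel,
        "nick": nick,
        "botnick": botnick,
        "channel_ops": " ".join(ops),  # assumption: ops are given as a list of nicks
    }
    # safe_substitute leaves unknown keywords untouched instead of raising.
    return Template(reply).safe_substitute(values)


print(fill_keywords("$nick: the ops of $channel are ${channel_ops}",
                    "#ubuntu", "bob", "ubottu", ["op1", "op2"]))
```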
factoid replaces
This basically replaces a special keyword with the reply of another factoid. It would take the place of the custom keywords currently used by Encyclopedia, like $curStable: instead, you create a factoid and use it as a replacer. Example:
!curStable is <reply> Ubuntu 10.04
!ubuntu is <reply> Current stable release is %{curstable}
This is much more flexible, and means a bot owner/admin is no longer needed to update them.
testcases
This is very important. Looking at all this you can guess that the plugin will get complicated, so testcases are needed to avoid regressions and make the code easier to maintain. They also help newcomers, who can start hacking away and learn the code without fear of breaking stuff, and they make debugging a lot easier as well.
Most (by most, read all) plugins in ubottu lack testcases, which is annoying. Python and Supybot make it ridiculously easy to create testcases. It is true that not everything can be tested, but the main features must have a testcase.
People that neglect this should be severely punished!
Some code
Stormyfacts
Using Storm branch
Factos
As shouldn't come as a surprise, I wrote my own factoid plugin to help in Spanish channels. It's called Factos (the word we came up with for "factoid"); its devel branch is located here, along with a somewhat outdated README. It has most of the concepts I mentioned here implemented, but not exactly in the same way (mostly because I haven't had time to refactor it yet). Because I wrote it from scratch it has some major differences from how Encyclopedia behaves: for example, besides not allowing factoids with spaces, it uses the invalidCommand method to call factoids, and as a consequence factoids can't override supybot's own commands, which always annoyed me. Nevertheless, I hope Factos can serve as an inspiration for this new plugin.
database stuff
Factos has a somewhat abstracted API for handling databases. Still, it would be good to check other alternatives.
Anyway, I wrote some examples of using Factos objects here: FactosCodeExample
- I'm currently trying to use Storm; I think it would be better. M4v
