Download official/archive documents
I have seen several offers of what are called 'coffres forts numériques' in French, i.e. 'digital safe deposit boxes' (I am not sure of my translation). Some of them let you set up your credentials for various third-party websites and then automatically and regularly download official/archive documents (tax documents from the government, bills from various sites, ...), which these sites often provide as PDFs.
Of course, this is something I do not like at all (these websites gain access to most of people's credentials, and we depend on their security measures). However, this makes me think it could be a very good new capability for weboob. The most difficult part would probably be identifying which documents were already downloaded, to avoid downloading all the available documents every time.
What do you think of this? Are some people interested? Is there a place to discuss such a proposal (mailing list, ...)?
However, this makes me think it could be a very good new capability for weboob.
If you mean "capability" in the weboob sense, it already exists with the bill capability (even though it's not the most well-documented) and the boobill command. There are a few weboob modules implementing it.
The most difficult part would probably be identifying which documents were already downloaded, to avoid downloading all the available documents every time.
Many weboob objects have an id field, which is generally stable when reconnecting to a website. This could be a starting point to know whether a file has already been downloaded.
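To make the idea concrete, here is a minimal sketch of tracking already-downloaded documents by their id. It does not use the weboob API at all: `download_new`, `load_seen`, `save_seen`, the `fetch` callable and the JSON state file are all hypothetical names chosen for illustration, and it assumes each document can be described by a stable id and a filename.

```python
import json
from pathlib import Path


def load_seen(state_file):
    """Return the set of document ids recorded in the state file (empty if none)."""
    path = Path(state_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen(state_file, seen):
    """Write the set of known document ids back to the state file."""
    Path(state_file).write_text(json.dumps(sorted(seen)), encoding="utf-8")


def download_new(documents, fetch, dest_dir, state_file):
    """Download only the documents whose id is not yet in the state file.

    documents: iterable of (doc_id, filename) pairs (hypothetical shape).
    fetch: callable taking a doc_id and returning the file content as bytes
           (hypothetical; in practice it would wrap whatever does the download).
    Returns the list of ids downloaded during this run.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    seen = load_seen(state_file)
    downloaded = []
    for doc_id, filename in documents:
        if doc_id in seen:
            continue  # already fetched in a previous run
        (dest / filename).write_bytes(fetch(doc_id))
        seen.add(doc_id)
        downloaded.append(doc_id)
    save_seen(state_file, seen)
    return downloaded
```

Running this twice with the same list of documents should fetch nothing the second time, which is what would make it suitable for a periodic job. It relies entirely on the ids staying stable between sessions, as described above.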
Is there a place to discuss such a proposal (mailing list, ...)?
There's a mailing list, or #weboob on Freenode IRC.
It does not need to be part of weboob itself. In other areas than bills and documents, such as banking PFMs or housing search, a few projects have grown outside of weboob while still using weboob as their data source library (kresus and flatisfy, for example).
We already have some modules for official websites (like ensap, for French civil servants). If you are interested, I will commit a proof-of-concept script (meant to be run from cron) that downloads every new document each day. It works fine for my usage: I download some bills and ensap documents with this solution.
I am closing this ticket since it was more a question than an issue. Don't hesitate to discuss on the mailing list or the IRC channel if you have more questions.