Fantoir-datasource
Latest revision as of 01:49, 18 January 2023
The fantoir-datasource tool imports and manages the FANTOIR datasource.
Database setup
The tool uses a PostgreSQL database to store FANTOIR, Wikidata and configuration data.
[Figure: Database schema for fantoir-datasource]
Recommended workflow
To update the PostgreSQL database correctly, follow these steps:
- fetch operation to get the latest FANTOIR file and extract it;
- import operation to read it and insert the data into PostgreSQL; use the -c option so the tool creates the table, and specify a NEW table so no service is interrupted during the update;
- wikidata operation to import rich metadata from Wikidata; use the -tc option to truncate or create the table as needed;
- query operation to test the newly imported dataset with a known value;
- promote operation to update the foreign keys and the configuration so they point to the new table.
A maintenance report for Wikidata can be created during that import with fantoir-datasource wikidata -tc --maintenance-report.
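The steps after fetch can be sketched as a short Python driver. This is only an illustration under stated assumptions: the positional arguments of import (file, then table) and of promote (table) are not shown on this page and are assumed here, and the query arguments are left to the caller because the page does not document them. The function names are hypothetical.

```python
import subprocess

TOOL = "fantoir-datasource"


def build_workflow(fantoir_file, new_table, query_args):
    """Return the argv lists for the import, wikidata, query and promote steps."""
    return [
        # ASSUMPTION: import takes the file, then the target table, as positional arguments.
        [TOOL, "import", "-c", fantoir_file, new_table],
        # -tc truncates or creates the Wikidata table as needed.
        [TOOL, "wikidata", "-tc"],
        # The query arguments are not documented here; the caller supplies them.
        [TOOL, "query", *query_args],
        # ASSUMPTION: promote takes the table name as a positional argument.
        [TOOL, "promote", new_table],
    ]


def run_workflow(steps):
    """Run each step in order and stop at the first failing command."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

The fetch step is left out on purpose: its exit codes carry more meaning than plain success or failure, so it is better handled separately. Because promote runs last, a failing query leaves the current table untouched.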
Usage
Most commands require the DATABASE_URL environment variable to interact with PostgreSQL. The fetch command can be used without it.
$ fantoir-datasource help
Import FANTOIR database into PostgreSQL

Usage: fantoir-datasource <COMMAND>

Commands:
  fetch     Fetch the last version of the FANTOIR file
  import    Import from FANTOIR file generated by the DGFIP
  promote   Promote an imported FANTOIR table as the current FANTOIR table to use
  wikidata  Query Wikidata SPARQL end-point to enrich FANTOIR information
  query     Query the imported FANTOIR table
  help      Print this message or the help of the given subcommand(s)
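A wrapper script can check the DATABASE_URL requirement before calling the tool. The sketch below is a hypothetical helper, not part of fantoir-datasource. It treats every command other than fetch as needing the variable, which is stricter than the "most commands" stated above.

```python
import os


def missing_database_url(command, env=None):
    """Return True when `command` needs DATABASE_URL and it is unset or empty."""
    env = os.environ if env is None else env
    if command == "fetch":
        # fetch is the one command documented as usable without a database.
        return False
    # ASSUMPTION: every other command is treated as requiring the variable.
    return not env.get("DATABASE_URL")
```

For instance, missing_database_url("import", {}) is true, while the same call with a DATABASE_URL entry in the mapping is false.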
Fetch
The fetch command checks the latest available version through the data.economie.gouv.fr API and compares it with the FANTOIRmmyy file present in the working directory (if any). It then downloads and unzips the file if needed.
To force the file to be overwritten, use fantoir-datasource fetch --overwrite.
The exit code can be used to determine programmatically what happened:
Code | Description |
---|---|
0 | Fetch operation succeeded |
2 | Argument parsing issue (not fetch specific), or unzip format issue (fetch specific) |
12 | FANTOIR file already exists and --overwrite not specified |
16 | Download issue, at filesystem or HTTP level; stderr provides details suitable for a human reader or for a log |
32 | unzip not found; install unzip to decompress the archive |
n | Exit code from unzip command (man 1 unzip) |
127 | Error with the unzip process, without knowing the unzip exit code |
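A calling script can turn the table above into a lookup. The sketch below is a hypothetical helper. The descriptions come from the table, and any code not listed is reported as coming from unzip, matching the "n" row.

```python
# Exit codes documented for `fantoir-datasource fetch`.
FETCH_EXIT_CODES = {
    0: "Fetch operation succeeded",
    2: "Argument parsing issue, or unzip format issue",
    12: "FANTOIR file already exists and --overwrite not specified",
    16: "Download issue, at filesystem or HTTP level",
    32: "unzip not found",
    127: "Error with the unzip process, unzip exit code unknown",
}


def describe_fetch_exit(code):
    """Return a human-readable description for a fetch exit code."""
    if code in FETCH_EXIT_CODES:
        return FETCH_EXIT_CODES[code]
    # Any other value is passed through from the unzip command.
    return f"Exit code {code} from the unzip command (see man 1 unzip)"
```

Code 12 is often not a failure for a scheduled job: it means the file is already present, so the job can skip the import.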
If the file is extracted successfully, the tool prints the following on stdout:
FANTOIR_FILE=FANTOIR1022
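Because the file name is printed as a KEY=value line, a script can capture it for the import step. This is a minimal sketch with a hypothetical function name. It assumes stdout may contain other lines and keeps only the FANTOIR_FILE one.

```python
def parse_fetch_output(stdout):
    """Return the value of the FANTOIR_FILE line, or None if it is absent."""
    for line in stdout.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "FANTOIR_FILE" and value:
            return value
    return None
```

With the output shown above, parse_fetch_output("FANTOIR_FILE=FANTOIR1022\n") returns "FANTOIR1022".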