Fantoir-datasource: Difference between revisions
Revision as of 06:20, 15 January 2023
The fantoir-datasource tool imports and manages the FANTOIR datasource.
Database setup
The tool uses a PostgreSQL database to store FANTOIR, Wikidata and configuration data.
Recommended workflow
To update the PostgreSQL database correctly, follow these steps:
- fetch operation to get the latest FANTOIR file and extract it;
- import operation to read it and insert the data into PostgreSQL. Use the -c option so the tool creates the table, and specify a NEW table so no service is interrupted during the update;
- wikidata operation to import rich metadata from Wikidata. Use the -tc option to truncate/create the table as needed;
- query operation to test the newly imported dataset with a known value;
- promote operation to update foreign keys and configuration to point to the new table.
A maintenance report for Wikidata can be generated during that import with fantoir-datasource wikidata -tc --maintenance-report.
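The first three steps above can be sketched as a small shell function. This is only an illustration under assumptions: the argument order of import (file, then table), the FANTOIR_BIN variable and the update_fantoir name are hypothetical and not confirmed by this page, so check fantoir-datasource help import before relying on it. The query and promote steps are deliberately left to the operator.

```shell
# Binary to call; overridable, which also makes the sketch easy to dry-run.
FANTOIR_BIN="${FANTOIR_BIN:-fantoir-datasource}"

# update_fantoir NEW_TABLE
# Runs fetch, import and wikidata, stopping at the first failure.
update_fantoir() {
    new_table="$1"   # hypothetical name of the NEW table to create

    # Step 1: fetch, then read the FANTOIR_FILE=... line from stdout.
    fetch_output=$("$FANTOIR_BIN" fetch) || return $?
    fantoir_file=$(printf '%s\n' "$fetch_output" \
        | sed -n 's/^FANTOIR_FILE=//p' | tail -n 1)
    [ -n "$fantoir_file" ] || return 1

    # Step 2: import into the new table; -c creates it.
    # ASSUMPTION: positional arguments are <file> <table>.
    "$FANTOIR_BIN" import -c "$fantoir_file" "$new_table" || return $?

    # Step 3: enrich with Wikidata; -tc truncates/creates as needed.
    "$FANTOIR_BIN" wikidata -tc || return $?

    # Steps 4 and 5 stay manual: query a known value, then promote.
    echo "Imported $fantoir_file into $new_table; run query, then promote."
}
```

Stopping before promote keeps the currently promoted table in use until someone has checked the new data with query.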
Usage
Most commands require the DATABASE_URL environment variable to interact with PostgreSQL. The fetch command can be used without it.
$ fantoir-datasource help
Import FANTOIR database into PostgreSQL
Usage: fantoir-datasource <COMMAND>
Commands:
  fetch     Fetch the last version of the FANTOIR file
  import    Import from FANTOIR file generated by the DGFIP
  promote   Promote an imported FANTOIR table as the current FANTOIR table to use
  wikidata  Query Wikidata SPARQL end-point to enrich FANTOIR information
  query     Query the imported FANTOIR table
  help      Print this message or the help of the given subcommand(s)
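As an illustration of the DATABASE_URL requirement mentioned above, a wrapper script could set the variable and refuse to continue when it is missing. The connection values and the require_database_url helper below are hypothetical, and the URL assumes the usual PostgreSQL connection URI form; adapt them to your own setup.

```shell
# Hypothetical connection values; replace with your own.
export DATABASE_URL="postgres://fantoir:secret@localhost:5432/fantoir"

# Guard for scripts: fail early when DATABASE_URL is unset or empty.
require_database_url() {
    if [ -z "${DATABASE_URL:-}" ]; then
        echo "DATABASE_URL is not set" >&2
        return 1
    fi
}
```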
Fetch
Checks the latest available version through the data.economie.gouv.fr API and compares it with the FANTOIRmmyy file present in the working directory (if any), then downloads the file and unzips it if needed.
To force overwriting the file, use fantoir-datasource fetch --overwrite.
The exit code can be used to determine programmatically what happened:
Code | Description |
---|---|
0 | Fetch operation succeeded |
2 | Parsing arguments issue (not fetch specific) |
4 | FANTOIR file already exists and --overwrite not specified |
8 | Download issue, at filesystem or HTTP level; stderr provides details suitable for humans or for a log |
n | Exit code from unzip command (man 1 unzip) |
127 | Error with the unzip process, without knowing the unzip exit code |
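A calling script might translate these codes into messages as sketched below. The describe_fetch_exit name is hypothetical, and the mapping simply follows the table: any code not listed explicitly is treated as coming from unzip, although unzip can itself return small values, so treat the result as a hint.

```shell
# describe_fetch_exit CODE
# Prints a short description of a fetch exit code, following the table.
describe_fetch_exit() {
    case "$1" in
        0)   echo "fetch succeeded" ;;
        2)   echo "argument parsing issue" ;;
        4)   echo "FANTOIR file already exists, --overwrite not specified" ;;
        8)   echo "download issue, see stderr" ;;
        127) echo "unzip process error, exit code unknown" ;;
        *)   echo "unzip exited with code $1, see man 1 unzip" ;;
    esac
}

# Possible usage:
#   fantoir-datasource fetch
#   describe_fetch_exit $?
```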
If the file is extracted and all is fine, the script prints to stdout:
FANTOIR_FILE=FANTOIR1022
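A script can pick the file name out of that line without evaluating the output. The sketch below is one way to do it; parse_fantoir_file is a hypothetical helper, and it assumes the FANTOIR_FILE= line starts at the beginning of a line of stdout.

```shell
# Reads fetch output on stdin and prints the value of the last
# FANTOIR_FILE=... line, or nothing when no such line is present.
parse_fantoir_file() {
    sed -n 's/^FANTOIR_FILE=//p' | tail -n 1
}

# Possible usage:
#   fantoir_file=$(fantoir-datasource fetch | parse_fantoir_file)
```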