Configuration Options¶
This page documents all configuration options for the Data Lineage Analyzer (DataFlowAnalyzer). Options are organized by which mode they apply to:
- Query mode — Interactive analysis of SQL text pasted into the SQLFlow UI. Settings are toggleable in real time and saved per user.
- Job mode — Batch analysis of uploaded SQL files or database connections in SQLFlow. Settings are fixed at job creation time and read-only afterwards.
- CLI — Command-line usage via
DataFlowAnalyzerdemo class. All options are available via flags. - Java API — Programmatic access via the
DataFlowAnalyzerandOptionclasses.
!!! info "Mode availability" Each option's heading shows which modes support it using badges: Query Job CLI API. Options marked CLI/API only are not exposed in the SQLFlow web UI.
Input Options¶
/f <path> — SQL File¶
:material-tag: CLI | API
Specify a single SQL file to analyze.
1 | |
Java API: Pass a File object to the DataFlowAnalyzer constructor.
/d <path> — SQL Directory¶
:material-tag: CLI | API
Specify a directory containing SQL files. All .sql files in the directory will be analyzed together.
1 | |
!!! warning
The analyzer scans all files in the directory, including non-SQL files (.json, .md, etc.), which may cause parse errors. Use /f with individual files if the directory contains mixed file types.
/t <database_type> — Database Vendor¶
:material-tag: Query | Job | CLI | API
Set the SQL dialect for parsing. Default: oracle.
1 | |
Supported values: access, bigquery, couchbase, dax, db2, gaussdb, greenplum, hana, hive, impala, informix, mdx, mssql, sqlserver, mysql, netezza, odbc, openedge, oracle, postgresql, postgres, redshift, snowflake, sybase, teradata, soql, vertica
Java API: EDbVendor.dbvoracle, EDbVendor.dbvmssql, etc.
SQLFlow: Selected via the database vendor dropdown in both query mode and job creation.
/env <path> — External Metadata¶
:material-tag: Job | CLI | API
Provide a metadata.json file containing database schema metadata (tables, columns, types) to improve column resolution accuracy.
1 | |
Java API: dlineage.setSqlEnv(sqlenv);
SQLFlow Job: Metadata is provided automatically when a database connection is configured as the data source.
/fromdb <connection_params> — Database Connection¶
:material-tag: Job | CLI
Connect directly to a database to extract metadata. Use with /exportonly to only export metadata without analysis.
1 2 | |
SQLFlow Job: Configured via the data source connection UI during job creation.
Output Mode Options¶
These options control the overall structure and format of the output.
/s — Simple Output¶
:material-tag: Query | CLI | API
Produce simplified output that omits intermediate result sets (subqueries, CTEs, derived tables). Shows only the direct lineage between base tables/views and final targets.
1 | |
Java API: Pass true as the third parameter to the constructor: new DataFlowAnalyzer(file, vendor, true)
Effect: When enabled, also automatically enables /showCallRelation (call relationships between procedures).
SQLFlow Query mode: Always uses simple output mode internally.
/i — Ignore Intermediate Result Sets¶
:material-tag: CLI | API
Similar to /s, but preserves result sets generated by SQL functions. Equivalent to /s /topselectlist plus keeping function-generated result sets.
1 | |
Java API: dlineage.setIgnoreRecordSet(true); + dlineage.setSimpleShowFunction(true);
/if — Ignore Function Result Sets¶
:material-tag: CLI | API
Keep all intermediate result sets except those generated by SQL functions.
1 | |
/topselectlist — Show Top Select Results¶
:material-tag: CLI | API
When used with /s, includes the top-level SELECT result set columns in the output.
1 | |
Java API: dlineage.setSimpleShowTopSelectResultSet(true);
/text — Text Format¶
:material-tag: CLI | API
Output column dependencies in a plain text format instead of XML. Only valid with /s.
1 | |
Java API: dlineage.setTextFormat(true);
/json — JSON Format¶
:material-tag: Query | Job | CLI | API
Output in JSON format instead of XML.
1 | |
SQLFlow: Both query mode and job mode use JSON format internally for the visualization frontend.
/graph — JSON with Graph¶
:material-tag: Query | Job | CLI | API
Output JSON format with graph layout information for visualization. Requires /json.
1 | |
/tableLineage — Table-Level Lineage¶
:material-tag: CLI | API
Output table-level lineage instead of column-level lineage. When enabled, /s and /i are automatically disabled.
1 | |
/csv — CSV Format¶
:material-tag: Job | CLI | API
Output lineage in CSV format. Works with both column-level and table-level (/tableLineage) lineage.
1 2 | |
SQLFlow Job: CSV export format can be configured via the csvFormat job property.
/delimiter <char> — CSV Delimiter¶
:material-tag: Job | CLI | API
Specify the delimiter for CSV output. Default: ,.
1 | |
/lv — Lineage for Visualization¶
:material-tag: CLI | API
Output lineage in a format optimized for visualization tools.
1 | |
/o <path> — Output File¶
:material-tag: CLI
Write output to a file instead of stdout.
1 | |
/stat — Statistics Mode¶
:material-tag: CLI | API
Output analysis statistics (timing, counts) instead of lineage data.
1 | |
/version — Version Info¶
:material-tag: CLI
Print the DataFlowAnalyzer version and release date.
1 | |
Lineage Content Options¶
These options control what relationships and objects are included in the lineage output.
Dataflow (fdd)¶
:material-tag: Query | Job
Show direct dataflow (fdd) relationships. Default: true.
SQLFlow Query mode: Toggled via the Dataflow switch in the settings panel. Disabled in job mode (read from job config).
SQLFlow Job mode: Configured via showRelationType field (include fdd).
CLI equivalent: /filterRelationTypes fdd or /simpleShowRelationTypes fdd
Impact (fdr)¶
:material-tag: Query | Job
Show indirect/impact dataflow (fdr) relationships. Default: false.
SQLFlow Query mode: Toggled via the Impact switch. Disabled in job mode.
SQLFlow Job mode: Configured via showRelationType field (include fdr).
CLI equivalent: /filterRelationTypes fdr or /simpleShowRelationTypes fdr
!!! note At least one of Dataflow or Impact must be enabled. The UI enforces this constraint.
/j — Show Join Relations¶
:material-tag: CLI | API
Include JOIN relationships in the output. By default, only data flow (fdd/fdr) relationships are shown.
1 | |
Java API: dlineage.setShowJoin(true);
Relation type: join
/showER — Show Entity Relationships¶
:material-tag: CLI | API
Include entity relationships (foreign key references) in the output.
1 | |
Java API: dlineage.getOption().setShowERDiagram(true);
Relation type: er
/showConstant — Show Constant Tables¶
:material-tag: Query | Job | CLI | API
Include constant/literal values as pseudo-tables in the lineage. Useful for tracking where hardcoded values flow.
1 | |
Java API: dlineage.setShowConstantTable(true);
SQLFlow Query mode: Toggled via the show constant switch.
SQLFlow Job mode: Set at job creation via showConstantTable field.
/showCaseWhenAsDirect — CASE WHEN as Direct Dataflow¶
:material-tag: Query | Job | CLI | API
Treat columns in CASE WHEN conditions as direct dataflow (fdd) instead of indirect (fdr).
1 | |
Java API: dlineage.getOption().setShowCaseWhenAsDirect(true);
Default: false — condition columns are treated as indirect (fdr), only THEN/ELSE result expressions produce direct (fdd) relationships.
Example:
1 2 | |
| Option | label sources (fdd) |
|---|---|
Default (false) |
None (THEN/ELSE are string literals) |
true |
orders.status |
!!! note
This option applies to SQL CASE WHEN expressions parsed as TCaseExpression. It does not currently affect Oracle's DECODE() function, which is handled separately via function.properties. See the DECODE section below for details.
!!! warning "SQLFlow UI: not yet wired"
The underlying Option.setShowCaseWhenAsDirect() API supports all modes, but the SQLFlow web UI does not yet expose a toggle for this option. It needs to be added to: (1) the frontend setting panel as a Switch, (2) the Job model as a showCaseWhenAsDirect field, and (3) the backend controller to pass it through to DataFlowAnalyzer.
/treatArgumentsInCountFunctionAsDirectDataflow — COUNT Arguments as Direct¶
:material-tag: Query | Job | CLI | API
Treat arguments inside COUNT() and other aggregate functions as direct dataflow. By default, COUNT(column) treats the column as indirect since COUNT only produces a count, not the column values.
1 | |
Java API: dlineage.setShowCountTableColumn(true);
SQLFlow Query mode: Toggled via the Dataflow of aggregate function switch (direct / indirect).
SQLFlow Job mode: Set at job creation via treatArgumentsInCountFunctionAsDirectDataflow field.
/transform — Show Transform Code¶
:material-tag: Query | Job | CLI | API
Output the SQL transformation code for each relationship.
1 | |
Java API: dlineage.setTransform(true);
SQLFlow Query mode: Toggled via the show transform switch.
SQLFlow Job mode: Set at job creation via showTransform field.
/traceView — Trace View Sources¶
:material-tag: CLI | API
Output only the names of source tables and views, ignoring all intermediate data. Useful for a high-level view of which base tables feed into which views.
1 | |
Effect: Automatically enables /s (simple mode).
/lof — Link Orphan Columns to First Table¶
:material-tag: Query | Job | CLI | API
When a column reference cannot be resolved to any table (orphan column), link it to the first table in the FROM clause.
1 | |
Java API: dlineage.setLinkOrphanColumnToFirstTable(true);
!!! tip
This is useful when analyzing SQL without external metadata (/env), where some column-to-table mappings are ambiguous.
!!! warning "SQLFlow UI: not yet wired"
The underlying Option.setLinkOrphanColumnToFirstTable() API supports all modes, but the SQLFlow web UI does not yet expose a toggle for this option. It needs to be added to: (1) the frontend setting panel as a Switch, (2) the Job model as a linkOrphanColumnToFirstTable field, and (3) the backend controller to pass it through to DataFlowAnalyzer.
Intermediate Result Set Options¶
These options control whether intermediate result sets (subqueries, functions, CTEs) are shown or hidden.
Show Intermediate Recordset¶
:material-tag: Query | Job | CLI | API
Show intermediate result sets (subqueries, CTEs, derived tables) in the lineage output. Default: off in SQLFlow, on in CLI (unless /s is used).
SQLFlow Query mode: Toggled via the Intermediate recordset switch.
SQLFlow Job mode: Set at job creation via ignoreRecordSet field (true = hidden, false = shown).
CLI: Use /s to hide intermediate result sets, or omit /s to show them.
Java API: dlineage.setIgnoreRecordSet(true);
Show Function¶
:material-tag: Query | Job | CLI | API
Show result sets generated by SQL functions. Default: off in SQLFlow.
SQLFlow Query mode: Toggled via the Function switch.
SQLFlow Job mode: Set at job creation via ignoreFunction field (true = hidden, false = shown).
CLI: Use /if to hide function result sets.
Java API: dlineage.setSimpleShowFunction(true);
/showResultSetTypes <types> — Show Specific Result Set Types¶
:material-tag: CLI | API
Only valid with /s or /i. Specify which intermediate result set types to include. Comma-separated.
1 | |
Java API: dlineage.getOption().showResultSetTypes("cte", "function");
Available types:
| Type | Description |
|---|---|
select_list |
Top-level SELECT list |
array |
Array expressions |
struct |
Struct expressions |
result_of |
Result of expression |
cte |
Common Table Expressions |
insert_select |
INSERT...SELECT intermediate |
update_select |
UPDATE...SELECT intermediate |
merge_update |
MERGE UPDATE clause |
merge_insert |
MERGE INSERT clause |
output |
OUTPUT clause (SQL Server) |
update_set |
UPDATE SET clause |
pivot_table |
PIVOT result |
unpivot_table |
UNPIVOT result |
alias |
Alias expressions |
function |
Function result sets |
case_when |
CASE WHEN result sets |
cursor |
Cursor variables |
variable |
SQL variables |
/removeResultSetTypes <types> — Remove Specific Result Set Types¶
:material-tag: CLI | API
Remove specified result set types from the output. Comma-separated. Same type values as /showResultSetTypes.
1 | |
Filtering Options¶
/filterRelationTypes <types> — Filter Relation Types¶
:material-tag: Query | Job | CLI | API
Only output relationships of the specified types. Comma-separated.
1 | |
Java API: dlineage.getOption().filterRelationTypes("fdd", "fdr");
SQLFlow: Controlled via the Dataflow/Impact toggles (query mode) or showRelationType (job mode).
Available relation types:
| Type | Description |
|---|---|
fdd |
Forward Direct Dataflow — column value flows directly from source to target |
fddi |
Forward Direct Dataflow (Indirect) — column value flows with transformation |
frd |
Forward Reverse Dataflow |
fdr |
Forward Dataflow Reverse — indirect relationship (e.g., used in WHERE, JOIN condition, GROUP BY) |
join |
JOIN relationship between tables |
call |
Procedure/function call relationship |
er |
Entity relationship (foreign key) |
crud |
CRUD operation relationship |
/simpleShowRelationTypes <types> — Simple Output Relation Types¶
:material-tag: CLI | API
Only valid with /s. Filter which relation types appear in simple output. Supports fdd, fdr.
1 | |
Show/Hide Object Options¶
These options control whether specific object types (variables, cursors, synonyms) appear in the output.
/showVariable — Show Variables¶
:material-tag: CLI | API
Show SQL variables in the output. Only valid with /s or /i.
1 | |
Java API: dlineage.getOption().setSimpleShowVariable(true);
/showCursor — Show Cursors¶
:material-tag: CLI | API
Show cursor declarations in the output. Only valid with /s or /i.
1 | |
Java API: dlineage.getOption().setSimpleShowCursor(true);
/showSynonym — Show Synonyms¶
:material-tag: CLI | API
Show synonym definitions in the output. Only valid with /s or /i.
1 | |
Java API: dlineage.getOption().setSimpleShowSynonym(true);
/removeVariable — Remove Variables¶
:material-tag: CLI | API
Remove all variable references from the output.
1 | |
/removeCursor — Remove Cursors¶
:material-tag: CLI | API
Remove all cursor references from the output.
1 | |
/removeUnusedSynonym — Remove Unused Synonyms¶
:material-tag: CLI | API
Remove synonyms that are defined but never referenced.
1 | |
Java API: dlineage.getOption().setIgnoreUnusedSynonym(true); (default: true)
/withTemporaryTable — Include Temporary Tables¶
:material-tag: Query | Job | CLI | API
Include temporary tables in the output. By default, temporary tables are excluded from simple output.
1 | |
Java API: dlineage.setIgnoreTemporaryTable(false);
!!! warning "SQLFlow UI: not yet wired"
The underlying Option.setIgnoreTemporaryTable() API supports all modes, but the SQLFlow web UI does not yet expose a toggle for this option. It needs to be added to: (1) the frontend setting panel as a Switch, (2) the Job model as an ignoreTemporaryTable field, and (3) the backend controller to pass it through to DataFlowAnalyzer.
Schema and Naming Options¶
/defaultDatabase <name> — Default Database¶
:material-tag: Job | CLI | API
Set the default database name for unqualified object references.
1 | |
Java API: dlineage.getOption().setDefaultDatabase("mydb");
SQLFlow Job: Set at job creation via the defaultDatabase field.
/defaultSchema <name> — Default Schema¶
:material-tag: Job | CLI | API
Set the default schema name for unqualified object references.
1 | |
Java API: dlineage.getOption().setDefaultSchema("dbo");
SQLFlow Job: Set at job creation via the defaultSchema field.
/showImplicitSchema — Show Implicit Schema¶
:material-tag: CLI | API
Display the implicit schema prefix for objects that don't have an explicit schema qualifier.
1 | |
Java API: dlineage.setShowImplicitSchema(true);
/n — Normalize Output¶
:material-tag: CLI | API
Normalize table and column names in the output (e.g., consistent casing).
1 | |
Java API: dlineage.getOption().setNormalizeOutput(true);
Coordinate and Transform Options¶
/ic — Ignore Coordinates¶
:material-tag: CLI | API
Omit source code coordinates (line/column numbers) from the output. Produces cleaner output for comparison purposes.
1 | |
Java API: dlineage.setIgnoreCoordinate(true);
/coor — Transform Coordinates Only¶
:material-tag: CLI | API
When used with /transform, output only the coordinates of the transformation code, not the code itself.
1 | |
Java API: dlineage.setTransformCoordinate(true);
Job-Mode-Only Options¶
These options are specific to SQLFlow job mode and control batch processing behavior.
Default Distance¶
:material-tag: Job
Set the maximum depth of lineage traversal in the visualization. Range: 1–10. Default: 10.
SQLFlow Job mode: Configured via the default distance input field. Disabled in query mode.
Show Indirect Clause¶
:material-tag: Job
Control whether indirect clause information (WHERE, JOIN ON, GROUP BY) is shown in the lineage. Default: on.
SQLFlow Job: Set via showIndirectClause field.
Hide Lineage From Stored Procedure¶
:material-tag: Job
When enabled, hides lineage that originates from inside stored procedures, showing only the procedure call relationship. Default: false.
SQLFlow Job: Set via hideLineageFromStoredProcedure field.
Parallel Processing¶
:material-tag: Job
Enable parallel processing of SQL files within a job. Default: false.
SQLFlow Job: Set via parallel field.
Persist Results¶
:material-tag: Job
Persist job results for later retrieval. Default: false.
SQLFlow Job: Set via persist field.
Incremental Analysis¶
:material-tag: Job
Enable incremental analysis — only re-analyze files that have changed since the last job run. Default: false.
SQLFlow Job: Set via incremental field.
Schedule Job¶
:material-tag: Job
Enable scheduled recurring execution using a cron expression.
SQLFlow Job: Set via schedulable field and cronExpression field.
Function Argument Lineage Configuration¶
function.properties Files¶
:material-tag: CLI | API
The function.properties files control how function arguments are classified as direct (fdd) or indirect (fdr) dataflow. These files are located at:
1 | |
Currently, function.properties files exist for:
- Oracle — DECODE function argument mapping
- GaussDB — DECODE function argument mapping (same as Oracle)
- BigQuery — SUBSTR function argument mapping
Format¶
Each line defines a function with its argument classifications:
1 | |
Where each argument type is:
| Type | Meaning |
|---|---|
fdd |
Direct dataflow — the argument value flows into the output |
fdr |
Indirect dataflow — the argument influences but doesn't directly become the output |
| (empty) | No dataflow relationship |
Example: Oracle DECODE¶
1 2 3 4 | |
- Argument 0 (expr):
fdr— the input expression is indirect (used for comparison, not output) - Search arguments:
fdr— comparison values are indirect - Result arguments:
fdd— result values are direct (become the output) - Default argument:
fdd— the default value is direct
!!! info "DECODE and /showCaseWhenAsDirect"
The /showCaseWhenAsDirect option does not currently change DECODE behavior because DECODE is handled as a function call via function.properties, not as a CASE WHEN expression. To change DECODE's first argument from fdr to fdd, modify function.properties directly.
Example: BigQuery SUBSTR¶
1 2 | |
Adding Custom Function Mappings¶
To customize how a function's arguments are treated:
- Create or edit
src/main/resources/gudusoft/gsqlparser/dataflow/<vendor>/function.properties - Add a line for each function overload (different argument counts need separate lines)
- Rebuild the project
Default behavior: If a function is not listed in function.properties, all arguments are treated as direct dataflow (fdd).
Java API-Only Options¶
These options are available through the Java API but do not have corresponding CLI flags or SQLFlow UI controls.
Option.setStartId(long) — Starting ID¶
:material-tag: API
Set the starting ID for generated elements. Useful when merging lineage from multiple analyses.
1 | |
Option.setParallel(int) — Parallelism¶
:material-tag: API
Set the number of threads for parallel analysis. Default: Runtime.availableProcessors() / 2.
1 | |
Option.setTraceSQL(boolean) — Trace SQL¶
:material-tag: API
Enable SQL tracing for debugging.
1 | |
Option.setTraceProcedure(boolean) — Trace Procedures¶
:material-tag: API
Enable procedure tracing for debugging.
1 | |
Option.setOutput(boolean) — Enable/Disable Output¶
:material-tag: API
Control whether output is generated. Set to false for programmatic access to the dataflow model without producing XML/JSON.
1 | |
Option.setShowCandidateTable(boolean) — Show Candidate Tables¶
:material-tag: API
Show candidate tables in ambiguous column resolution. Default: true.
1 | |
Option.setIgnoreInsertIntoValues(boolean) — Ignore INSERT VALUES¶
:material-tag: API
Ignore INSERT INTO ... VALUES (literal) statements (no data lineage from literals). Default: true.
1 | |
Option.setIgnoreTopSelect(boolean) — Ignore Top Select¶
:material-tag: API
Ignore top-level SELECT statements.
1 | |
Option.setFilePathDatabase(String) / Option.setFilePathSchema(String)¶
:material-tag: API
Derive database/schema names from the file path structure.
1 2 | |
Option.setAnalyzeMode(AnalyzeMode) — Analysis Mode¶
:material-tag: API
Set the analysis mode for specialized behavior.
1 2 3 | |
| Mode | Description |
|---|---|
dataflow |
Standard data lineage analysis (default) |
crud |
CRUD operation tracking |
dynamic |
Dynamic SQL analysis |
Option.setSqlflowIgnoreFunction(boolean) — SQLFlow Ignore Functions¶
:material-tag: API
Ignore function result sets in SQLFlow integration mode.
1 | |
Quick Reference¶
Mode Availability Summary¶
| Option | Query | Job | CLI | API | Default |
|---|---|---|---|---|---|
Database vendor (/t) |
:material-check: | :material-check: | :material-check: | :material-check: | oracle |
| Dataflow (fdd) toggle | :material-check: | :material-check: | :material-check: | :material-check: | true |
| Impact (fdr) toggle | :material-check: | :material-check: | :material-check: | :material-check: | false |
| Aggregate function as direct | :material-check: | :material-check: | :material-check: | :material-check: | true (SQLFlow) |
| Show intermediate recordset | :material-check: | :material-check: | :material-check: | :material-check: | off (SQLFlow) |
| Show function result sets | :material-check: | :material-check: | :material-check: | :material-check: | off (SQLFlow) |
| Show constant tables | :material-check: | :material-check: | :material-check: | :material-check: | false |
| Show transform code | :material-check: | :material-check: | :material-check: | :material-check: | false |
| Default distance | :material-check: | 10 |
|||
| Default database/schema | :material-check: | :material-check: | :material-check: | — | |
| External metadata | :material-check: | :material-check: | :material-check: | — | |
| CASE WHEN as direct | :material-check:* | :material-check:* | :material-check: | :material-check: | false |
| Show joins | :material-check: | :material-check: | false |
||
| Show ER diagram | :material-check: | :material-check: | false |
||
| Show variables/cursors/synonyms | :material-check: | :material-check: | false |
||
| Ignore coordinates | :material-check: | :material-check: | false |
||
| Normalize output | :material-check: | :material-check: | false |
||
| Link orphan to first table | :material-check:* | :material-check:* | :material-check: | :material-check: | false |
| Include temporary tables | :material-check:* | :material-check:* | :material-check: | :material-check: | false |
| Table-level lineage | :material-check: | :material-check: | false |
||
| Trace view sources | :material-check: | :material-check: | false |
||
| Hide SP lineage | :material-check: | false |
|||
| Parallel processing | :material-check: | :material-check: | false |
||
| Persist / Incremental / Schedule | :material-check: | false |
* Supported by the underlying API but not yet exposed in the SQLFlow web UI.
Common CLI Option Combinations¶
| Use Case | Command |
|---|---|
| Simple column lineage | /s /ic /text |
| Table-level lineage | /tableLineage /ic |
| Full lineage with joins | /j /showER |
| JSON output for visualization | /json /graph |
| CSV export | /s /csv |
| Include CASE WHEN sources | /s /ic /text /showCaseWhenAsDirect |
| Trace base tables only | /traceView /ic |
| Include constants and variables | /s /showConstant /showVariable |
Option Compatibility Matrix¶
| Option | Requires | Incompatible With |
|---|---|---|
/text |
/s |
/json |
/topselectlist |
/s |
— |
/graph |
/json |
/text |
/coor |
/transform |
— |
/showVariable |
/s or /i |
— |
/showCursor |
/s or /i |
— |
/showSynonym |
/s or /i |
— |
/simpleShowRelationTypes |
/s |
— |
/tableLineage |
— | /s, /i (auto-disabled) |
/withTemporaryTable |
/s implied |
— |