apache_beam.runners.interactive.sql.beam_sql_magics module¶
Module of beam_sql cell magic that executes a Beam SQL.
Only works within an IPython kernel.
-
class
apache_beam.runners.interactive.sql.beam_sql_magics.
BeamSqlParser
[source]¶ Bases:
object
A parser to parse beam_sql inputs.
-
parse
(args: List[str]) → Optional[argparse.Namespace][source]¶ Parses a list of string inputs.
- The parsed namespace contains these attributes:
- output_name: Optional[str], the output variable name. verbose: bool, whether to display more details of the magic execution. query: Optional[List[str]], the beam SQL query to execute.
Returns: The parsed args or None if fail to parse.
-
-
apache_beam.runners.interactive.sql.beam_sql_magics.
on_error
(error_msg, *args)[source]¶ Logs the error and the usage example.
-
class
apache_beam.runners.interactive.sql.beam_sql_magics.
BeamSqlMagics
(shell)[source]¶ Bases:
IPython.core.magic.Magics
-
beam_sql
(line: str, cell: Optional[str] = None) → Optional[apache_beam.pvalue.PValue][source]¶ The beam_sql line/cell magic that executes a Beam SQL.
Parameters: - line – the string on the same line after the beam_sql magic.
- cell – everything else in the same notebook cell as a string. If None, beam_sql is used as line magic. Otherwise, cell magic.
Returns None if running into an error or waiting for user input (running on a selected runner remotely), otherwise a PValue as if a SqlTransform is applied.
-
magics
= {'cell': {'beam_sql': 'beam_sql'}, 'line': {'beam_sql': 'beam_sql'}}¶
-
registered
= True¶
-
-
apache_beam.runners.interactive.sql.beam_sql_magics.
collect_data_for_local_run
(query: str, found: Dict[str, apache_beam.pvalue.PCollection])[source]¶
-
apache_beam.runners.interactive.sql.beam_sql_magics.
apply_sql
(query: str, output_name: Optional[str], found: Dict[str, apache_beam.pvalue.PCollection], run: bool = True) → Tuple[str, Union[apache_beam.pvalue.PValue, apache_beam.runners.interactive.sql.sql_chain.SqlNode], apache_beam.runners.interactive.sql.sql_chain.SqlChain][source]¶ Applies a SqlTransform with the given sql and queried PCollections.
Parameters: - query – The SQL query executed in the magic.
- output_name – (optional) The output variable name in __main__ module.
- found – The PCollections with variable names found to be used in the query.
- run – Whether to prepare the SQL pipeline for a local run or not.
Returns: A tuple of values. First str value is the output variable name in __main__ module, auto-generated if not provided. Second value: if run, it’s a PValue; otherwise, a SqlNode tracks the SQL without applying it or executing it. Third value: SqlChain is a chain of SqlNodes that have been applied.
-
apache_beam.runners.interactive.sql.beam_sql_magics.
pcolls_from_streaming_cache
(user_pipeline: apache_beam.pipeline.Pipeline, query_pipeline: apache_beam.pipeline.Pipeline, name_to_pcoll: Dict[str, apache_beam.pvalue.PCollection]) → Dict[str, apache_beam.pvalue.PCollection][source]¶ Reads PCollection cache through the TestStream.
Parameters: - user_pipeline – The beam.Pipeline object defined by the user in the notebook.
- query_pipeline – The beam.Pipeline object built by the magic to execute the SQL query.
- name_to_pcoll – PCollections with variable names used in the SQL query.
Returns: A Dict[str, beam.PCollection], where each PCollection is tagged with their PCollection variable names, read from the cache.
When the user_pipeline has unbounded sources, we force all cache reads to go through the TestStream even if they are bounded sources.