data_profiling.lib.base module
- class data_profiling.lib.base.C(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
StrEnum
- BLACK_SQUARE = '■'
- CHAR = 'CHAR'
- CLASSPATH = 'CLASSPATH'
- CLASS_NAME = 'class_name'
- CONNECTION_STRING = 'connection_string'
- CSV_EXTENSION = '.csv'
- DATABASE = 'database'
- DATE = 'DATE'
- DECIMAL = 'DECIMAL'
- EXCEL_EXTENSION = '.xlsx'
- FLOAT = 'FLOAT'
- JAR = 'jar'
- JDBC = 'jdbc'
- NUMBER = 'NUMBER'
- PORT_NUMBER = 'port_number'
- SQL_EXTENSION = '.sql'
- VARCHAR = 'VARCHAR'
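Because C derives from StrEnum, its members can be used wherever a plain string is expected. The following is a minimal sketch of that behaviour, not the module's actual source: it redefines a small subset of the constants listed above, and the helper `output_name` is hypothetical. The fallback class is only an assumption for interpreters older than Python 3.11, where `enum.StrEnum` is not available.

```python
try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:
    from enum import Enum

    class StrEnum(str, Enum):
        """Minimal stand-in for older Python versions."""


class C(StrEnum):
    # Subset of the constants documented above, redefined for illustration.
    CSV_EXTENSION = ".csv"
    SQL_EXTENSION = ".sql"
    VARCHAR = "VARCHAR"


def output_name(stem: str) -> str:
    # Hypothetical helper: members concatenate like ordinary strings.
    return stem + C.CSV_EXTENSION


# Members compare equal to their string values and can be looked up by value.
is_varchar = C.VARCHAR == "VARCHAR"
sql_member = C(".sql")
```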
- class data_profiling.lib.base.Config
Bases:
object
- CONFIG_DIR = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/data-profiling/checkouts/latest/config')
- PRIMARY_CONFIG_FILE = 'config.yaml'
- classmethod get_config(file_name: str = 'config.yaml') → dict
Read a configuration file from the configuration directory.
- Parameters:
file_name – file within the configuration directory
- Returns:
the configuration corresponding to that file
- class data_profiling.lib.base.Database(host_name: str, port_number: int, database_name: str, user_name: str, password: str, auto_commit: bool = False, **kwargs)
Bases:
object
Wrapper around the jaydebeapi module.
- classmethod execute(sql: str, parameters: list = [], cursor: Cursor = None, is_debug: bool = False) → Tuple[Cursor, list]
- Wrapper around the Cursor class. Returns a tuple containing:
1: the cursor with the result set
2: a list of the column names in the result set, or an empty list if not a SELECT statement
- Parameters:
sql – the query to be executed
parameters – the parameters to fill the placeholders
cursor – if provided will be used, else will create a new one
is_debug – if True, log the query but do not execute it
- Returns:
a tuple containing the cursor and the list of column names
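The return contract described for execute can be sketched with any DB-API 2.0 driver. The example below is an illustration only: it uses the standard-library sqlite3 module as a stand-in for jaydebeapi, takes the connection as an explicit argument (an assumption; the documented classmethod obtains it internally), and prints where the real method presumably logs.

```python
import sqlite3


def execute(connection, sql, parameters=(), cursor=None, is_debug=False):
    """Return (cursor, column_names); column_names is [] for non-SELECT statements."""
    if cursor is None:
        cursor = connection.cursor()
    if is_debug:
        # Debug mode: show the query but do not run it.
        print(sql)
        return cursor, []
    cursor.execute(sql, parameters)
    # DB-API cursors expose column metadata through .description,
    # which is None when the statement produced no result set.
    columns = [col[0] for col in cursor.description] if cursor.description else []
    return cursor, columns


connection = sqlite3.connect(":memory:")
_, create_columns = execute(connection, "CREATE TABLE t (id INTEGER, name TEXT)")
execute(connection, "INSERT INTO t VALUES (?, ?)", [1, "a"])
select_cursor, select_columns = execute(connection, "SELECT id, name FROM t")
rows = select_cursor.fetchall()
```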
- classmethod fetch_one_row(sql: str, parameters: list = [], default_value=None) → list | str | int
- Run the given query and fetch the first row. If default_value is not provided then … If there is only a single element in the select clause, the function returns None. If there are multiple elements in the select clause, the function returns [None] * the number of elements.
- Parameters:
sql – the query to be executed
parameters – the parameters to fill the placeholders
default_value – if the query does not return any rows, return this.
- Returns:
if the row contains two or more elements, they are returned as a list; otherwise a single item is returned.
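One possible reading of the fetch_one_row behaviour is sketched below. It is not the module's implementation: sqlite3 stands in for jaydebeapi, the connection is passed explicitly (an assumption), and the handling of an empty result without default_value follows the description above as far as it can be inferred.

```python
import sqlite3


def fetch_one_row(connection, sql, parameters=(), default_value=None):
    """Fetch the first row of a SELECT; a scalar for one column, a list for several."""
    cursor = connection.cursor()
    cursor.execute(sql, parameters)
    row = cursor.fetchone()
    # Number of elements in the select clause (SELECT statements only).
    width = len(cursor.description)
    if row is None:
        if default_value is not None:
            return default_value
        # No rows and no default: None, or one None per selected element.
        return None if width == 1 else [None] * width
    return row[0] if width == 1 else list(row)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE t (id INTEGER, name TEXT)")
connection.execute("INSERT INTO t VALUES (1, 'a')")

single = fetch_one_row(connection, "SELECT name FROM t WHERE id = ?", [1])
several = fetch_one_row(connection, "SELECT id, name FROM t")
missing = fetch_one_row(connection, "SELECT id, name FROM t WHERE id = ?", [99])
```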
- classmethod get_connection() → Connection
- class data_profiling.lib.base.Logger(level: [str | int] = None, session: str = None, **kwargs)
Bases:
object
- classmethod get_logger() → Logger
- record_factory_factory()
Enables us to display a session identifier with each log message.
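Attaching a session identifier to every log message is commonly done with the standard library's log record factory hook. The sketch below shows that general pattern under stated assumptions: the `session` parameter, the logger name `profiling_demo`, and the format string are hypothetical, and the documented method takes no arguments, so its real implementation may differ.

```python
import io
import logging


def record_factory_factory(session):
    """Build a record factory that stamps each LogRecord with a session id."""
    old_factory = logging.getLogRecordFactory()

    def record_factory(*args, **kwargs):
        record = old_factory(*args, **kwargs)
        record.session = session  # made available to formatters as %(session)s
        return record

    return record_factory


logging.setLogRecordFactory(record_factory_factory("session-42"))

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(session)s %(levelname)s %(message)s"))

logger = logging.getLogger("profiling_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output confined to our handler

logger.info("connected")
```

Wrapping the previous factory, rather than replacing it, preserves any attributes that other code has already arranged to add to log records.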
- classmethod set_level(level: str) → None
- data_profiling.lib.base.dedent_sql(s)
Remove leading spaces from all lines of a SQL query. Useful for logging.
- Parameters:
s – query
- Returns:
cleaned-up version of query
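A function with this description could be as small as the sketch below. This is a guess at the behaviour, not the library's source: it strips all leading whitespace from each line, whereas the real function may treat tabs or blank lines differently.

```python
def dedent_sql(s):
    """Strip leading whitespace from every line of a SQL string."""
    return "\n".join(line.lstrip() for line in s.splitlines())


query = """
    SELECT id,
           name
      FROM customers
"""
cleaned = dedent_sql(query)
```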
- data_profiling.lib.base.get_line_count(file_path: str | Path) → int
See https://stackoverflow.com/questions/845058/how-to-get-line-count-of-a-large-file-cheaply-in-python
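The linked discussion covers several approaches; which one this module uses is not stated here. One inexpensive technique, shown as a sketch only, reads the file in binary chunks and counts newline bytes. The `chunk_size` parameter is an assumption added for illustration.

```python
from pathlib import Path
from typing import Union


def get_line_count(file_path: Union[str, Path], chunk_size: int = 1024 * 1024) -> int:
    """Count newline characters in a file without loading it all into memory."""
    count = 0
    with open(file_path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            count += chunk.count(b"\n")
    return count
```

Note that this counts newline characters, so a final line without a trailing newline is not included in the total.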