class datarobotx.llm.DataDictChain(**kwargs)

Generate a data dictionary using an LLM.

  • as_json (bool, default = False) – Whether the chain output should be returned as a JSON string instead of a natural-language string

  • def_feature_chain (LLMChain, optional) – Chain used to define individual features. If not provided, a default chain is initialized that prompts for and retrieves individual definitions

  • verbose (bool, default = False) – Whether the chain should be run in verbose mode; only applies if the default feature definition chain is being used


>>> import json
>>> import langchain
>>> import os
>>> from datarobotx.llm import DataDictChain
>>> use_case_context = "Predicting hospital readmissions"
>>> dr_project_id = "XXX"
>>> os.environ["OPENAI_API_KEY"] = "XXX"
>>> llm = langchain.llms.OpenAI(model_name="text-davinci-003")
>>> chain = DataDictChain(llm=llm)
>>> outputs = chain(inputs=dict(project_id=dr_project_id, context=use_case_context))

Chain inputs and outputs:

  • input_keys – Chain inputs.

  • output_keys – Chain outputs.

property input_keys: List[str]

Chain inputs.


context : str

Context of the problem or use case for which a feature definition is being requested


The feature(s) for which a definition is being requested (comma separated)

project_id : str, optional

DataRobot project ID. If provided, EDA data is retrieved from DataRobot when available and used to try to improve the data dictionary completions
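
The inputs described above can be assembled with a small helper. This is a minimal sketch rather than part of the library: `build_inputs` is a hypothetical function, and the key name `features` is an assumption, since only `context` and `project_id` appear by name in the example above. The helper joins a list of feature names into the comma-separated string the chain expects and includes `project_id` only when one is given.

```python
from typing import Dict, Iterable, Optional


def build_inputs(
    context: str,
    features: Iterable[str],
    project_id: Optional[str] = None,
    features_key: str = "features",  # assumed key name; adjust if the chain expects another
) -> Dict[str, str]:
    """Build an inputs dict with comma-separated feature names."""
    inputs = {
        "context": context,
        # Strip stray whitespace so each name is separated by exactly ", ".
        features_key: ", ".join(name.strip() for name in features),
    }
    # project_id is optional, so only include it when provided.
    if project_id is not None:
        inputs["project_id"] = project_id
    return inputs


inputs = build_inputs(
    "Predicting hospital readmissions",
    ["age", " num_visits "],
    project_id="XXX",
)
print(inputs)
```

The resulting dict could then be passed as `chain(inputs=inputs)`, in the same way as the example near the top of this page.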

property output_keys: List[str]

Chain outputs.


Natural-language or JSON string representation of the data dictionary, depending on the value of the as_json parameter with which the chain was initialized
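
Because the output is a string in either mode, downstream code may want to handle both forms. The sketch below is illustrative only: `parse_data_dict` is a hypothetical helper, the sample strings are made up, and the assumption that the JSON form is an object mapping feature names to definitions is not confirmed by this page.

```python
import json
from typing import Dict, Union


def parse_data_dict(raw: str) -> Union[Dict[str, str], str]:
    """Return a dict if `raw` is a JSON object, otherwise the stripped text."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: treat it as natural-language output.
        return raw.strip()
    # Only a JSON object is treated as a data dictionary here (assumption).
    return parsed if isinstance(parsed, dict) else raw.strip()


# Hypothetical output strings standing in for what the chain might return.
json_output = '{"age": "Patient age in years", "num_visits": "Prior visits"}'
text_output = "age: Patient age in years\n"

print(parse_data_dict(json_output)["age"])
print(parse_data_dict(text_output))
```

With a chain created using `as_json=True`, the first branch would be the expected path; the fallback keeps the helper safe to use when the chain returns natural language.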

verbose: bool

Whether to run in verbose mode. In verbose mode, some intermediate logs are printed to the console. Defaults to the global verbose value, accessible via langchain.globals.get_verbose().