enrich
- datarobotx.llm.enrich(question, using, default_cache=True, verbose=False)
Enrich structured data with completions from an LLM or chain.
- Convenience function for use with pandas.DataFrame.apply():
  - Caches duplicate enrichment completions
  - Reports progress
  - Automatically formats the provided question with values from each pandas row or column
- ML-oriented default contextual prompts and chains:
  - Attempts to infer the appropriate completion type (numeric, categorical, date, or free text) and instructs the LLM accordingly
  - Includes prior completions in successive prompts to encourage consistency (e.g. date formatting, categorical levels)
- Customizable: interoperates with custom langchain Chains, Tools, and LLMs
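The idea of feeding prior completions back into later prompts can be sketched as below. This is only an illustration of the technique, not the prompt datarobotx actually builds: the helper name build_prompt, the max_prior parameter, and the prompt wording are all hypothetical.

```python
def build_prompt(question, prior_completions, max_prior=3):
    """Hypothetical helper: show recent answers so new ones follow the same format."""
    # Keep only the most recent answers; guard against max_prior <= 0,
    # because a slice of [-0:] would return the whole list.
    recent = prior_completions[-max_prior:] if max_prior > 0 else []
    lines = []
    if recent:
        lines.append("Previous answers (match their format):")
        lines.extend(f"- {answer}" for answer in recent)
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Two earlier date completions nudge the next answer toward YYYY-MM-DD.
prompt = build_prompt("When was Acme founded?", ["1999-04-01", "2004-11-15"])
print(prompt)
```

With no prior completions the helper returns only the question line, so the first row is prompted without examples.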
- Parameters:
question (str) – Question to be answered to enrich the dataset. Provided as a Python format string (f-string style) whose placeholders are filled with data from other fields in the DataFrame row or column
using (langchain.llms.BaseLLM or langchain.chains.base.Chain) – Langchain abstraction used to answer the question; if a custom chain or tool is provided, the question is formatted for each row/column in the DataFrame and then passed as the first argument when calling the chain's run() method
default_cache (bool, default = True) – If True, an InMemoryCache is initialized and used for the lifecycle of the returned function; caching reduces API consumption from duplicate completions
verbose (bool, default = False) – If True, the default enrichment LLMChains are run with verbose output
- Returns:
Function that can be used directly by pandas.DataFrame.apply() to perform the requested enrichment.
- Return type:
Callable
Examples
>>> import pandas as pd
>>> import langchain
>>> from datarobotx.llm.chains.enrich import enrich
>>> llm = langchain.llms.OpenAI(model_name="text-davinci-003")
>>> df = pd.read_csv('https://s3.amazonaws.com/datarobot_public_datasets/' +
...                  '10K_2007_to_2011_Lending_Club_Loans_v2_mod_80.csv')
>>> df_test = df[:5].copy(deep=True)
>>> df_test['f500_or_gov'] = df_test.apply(enrich('Is "{emp_title}" a Fortune 500 company or ' +
...                                               'large government organization (Y/N)?', llm),
...                                        axis=1)
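To make the contract concrete without calling a real LLM, the following minimal sketch mimics the shape of the returned callable: it formats the question template from each row and caches completions by formatted prompt. It is an approximation under stated assumptions, not the datarobotx implementation; make_enricher and fake_llm are hypothetical stand-ins, and the real function uses a langchain InMemoryCache rather than a plain dict.

```python
import pandas as pd


def make_enricher(question, answer_fn):
    """Hypothetical stand-in for enrich(): returns a callable for DataFrame.apply()."""
    cache = {}  # formatted prompt -> completion (assumed keying, for illustration)

    def _apply(row):
        # Fill the template's placeholders from the row's fields.
        prompt = question.format(**row.to_dict())
        # Only call the (possibly expensive) answer function for unseen prompts.
        if prompt not in cache:
            cache[prompt] = answer_fn(prompt)
        return cache[prompt]

    return _apply


calls = []  # records every prompt that actually reaches the fake LLM


def fake_llm(prompt):
    """Hypothetical deterministic replacement for an LLM call."""
    calls.append(prompt)
    return "Y" if "Army" in prompt else "N"


df = pd.DataFrame({"emp_title": ["US Army", "Joe's Diner", "US Army"]})
df["f500_or_gov"] = df.apply(
    make_enricher(
        'Is "{emp_title}" a Fortune 500 company or '
        "large government organization (Y/N)?",
        fake_llm,
    ),
    axis=1,  # pass each row to the callable, as in the example above
)
```

Because the first and third rows produce the same formatted prompt, fake_llm is invoked only twice for three rows, which is the kind of saving the default cache is meant to provide.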