This is page 1 of 3. Use http://codebase.md/tmlr-group/CausalCOAT?lines=true&page={x} to view the full context.

# Directory Structure

```
├── .DS_Store
├── AppleGastronome
│   ├── .DS_Store
│   ├── Apple_Gastronome_AG7_v20240513_colab_annoation_2024_05_16_1422_hk_Mixtral_GPT-3.5-Turbo_iter1.xlsx
│   ├── Apple_Gastronome_AG7_v20240513_colab_annoation_2024_05_16_1422_hk_Mixtral_GPT-3.5-Turbo_iter2.xlsx
│   ├── Apple_Gastronome_AG7_v20240513_colab_annoation_2024_05_16_1422_hk_Mixtral_GPT-3.5-Turbo_iter3.xlsx
│   ├── Apple_Gastronome_AG7_v20240513.ipynb
│   ├── Apple_Gastronome_AG7_v20240513.xlsx
│   └── Formal_exp_1 AG7_2024_05_16_1422 GPT-3.5-Turbo release.ipynb
├── assets
│   ├── coat_map.pdf
│   └── paper.pdf
├── coat_real_world_case
│   ├── brain_tumor.png
│   └── news_n_stock.png
├── lingam_coat_result
│   ├── lingam coat gpt3-5.png
│   ├── lingam coat gpt4.png
│   ├── lingam coat llama-2-70b.png
│   ├── lingam coat mistral-med.png
│   └── lingam_coat_table.xlsx
├── Neuropathic
│   ├── Formal_exp_1_Neuro_RSI symptom_only release.ipynb
│   ├── id_name.txt
│   ├── neuro_addition_1000_0126.csv
│   ├── neuro_R_shoulder_impingement.xlsx
│   ├── Neuropathic_Pain_Diagnosis_v20240119.ipynb
│   └── SimulatedData_SampleSize100.csv
├── openai_utils.py
├── README.md
└── utils.py
```

# Files

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
<h1 align='center'>
  Discovery of the Hidden World with Large Language Models
</h1>

<p align='center'>
  <a href="https://arxiv.org/abs/2402.03941"><img src="https://img.shields.io/badge/arXiv-2402.03941-b31b1b.svg" alt="Paper"></a>
  <a href="https://neurips.cc/"><img src="https://img.shields.io/badge/Pub-NeurIPS'24-blue" alt="Conf"></a>
  <a href="https://causalcoat.github.io/"><img src="https://img.shields.io/badge/website-CausalCOAT-D76364" alt="Slides"></a>
</p>


[NeurIPS 2024] Discovery of the Hidden World with Large Language Models

This repository contains the source code for reproducing the results of the NeurIPS'24 paper: [**Discovery of the Hidden World with Large Language Models**](https://arxiv.org/abs/2402.03941).

**Author List**: Chenxi Liu*, Yongqiang Chen*, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

(* Equal Contribution)

## Prepare to Use API

### Interact with OpenAI

1. Go through [the OpenAI Quickstart Tutorial](https://platform.openai.com/docs/quickstart?context=python)
2. In this project, we use `LLM_openai()` to control the interaction with OpenAI
```python
import sys
sys.path.insert(0, '/project/root/path') # set project root path

from utils import Logger
from openai_utils import LLM_openai

# check the conversation log at 'logs/{exp_name}'
logger = Logger(
    root = '/project/root/path',
    exp_name = 'set_exp_name'
)

llm = LLM_openai(logger)

response = llm.chat(
    model="gpt-4-1106-preview",
    content="Hello GPT4!",
    system_instruct = "You are a helpful assistant."
)
```

### Interact with Poe

1. Go through [the Poe FastAPI Tutorial](https://creator.poe.com/docs/accessing-other-bots-on-poe)
2. In this project, we use `chat()` to control the interaction with Poe
```python
import fastapi_poe as fp
import asyncio

async def chat(content, bot_name="GPT-3.5-Turbo"):
    poe_api_key = "your api key" # set your api key
    message = fp.ProtocolMessage(role="user", content=content)
    responce = ""
    # accumulate the streamed partial replies into one string
    async for partial in fp.get_bot_response(messages=[message], bot_name=bot_name, api_key=poe_api_key):
        responce = responce + partial.text

    return responce

# top-level `await` works in a notebook; in a script, use asyncio.run(chat(...))
response = await chat("Hello world!", bot_name="GPT-3.5-Turbo")
```


## The AppleGastronome Benchmark

### Files

- **Apple_Gastronome_AG7_v20240513.ipynb** This is the notebook that generates the AppleGastronome Benchmark dataset. "AG7" means that 7 variables are considered in the data-generating process.
- **Apple_Gastronome_AG7_v20240513.xlsx** This is the generated AppleGastronome Benchmark dataset. It consists of 8 columns: 7 variables plus one column for the review texts.
- **Formal_exp_1 AG7_2024_05_16_1422 GPT-3.5-Turbo release.ipynb** This is the notebook for testing the COAT method with different foundation models.
- **Other .xlsx files** Annotations generated by [annotation-2024_05_16_1422_hk_mistralai_GPT-3.5-Turbo release.ipynb](https://colab.research.google.com/drive/1GHUIk631UzD-lI49H_0eTJc9YKJ2lwxe?usp=sharing)

Conversation Logs:
- **META**: https://poe.com/s/Suv0XqninMg9nQTGFEiy
- **COAT+CoT**: https://poe.com/s/x1tha9wV6opPydhym2sQ
- **DATA**: First round of COAT.
- **COAT**: https://poe.com/s/tYvcQuIyRB54wVpyUZyx


## The Neuropathic Benchmark

Source of the causal graph: [here](https://observablehq.com/@turuibo/the-complete-causal-graph-of-neuropathic-pain-diagnosis)

### Files

- **SimulatedData_SampleSize100.csv** and **neuro_addition_1000_0126.csv** Generated from the [Neuropathic Pain Diagnosis Simulator](https://github.com/TURuibo/Neuropathic-Pain-Diagnosis-Simulator). They simulate real-life diagnosis with three levels of variables: the symptom level, the radiculopathy level, and the pathophysiology level.
- **id_name.txt** From the [Neuropathic Pain Diagnosis Simulator](https://github.com/TURuibo/Neuropathic-Pain-Diagnosis-Simulator). It stores the variable names.
- **Neuropathic_Pain_Diagnosis_v20240119.ipynb** This is the notebook that generates clinical notes from the tabular data *SimulatedData_SampleSize100.csv*. *Only symptom-level variables are used.*
- **neuro_R_shoulder_impingement.xlsx** These are the generated clinical notes.
- **Formal_exp_1_Neuro_RSI symptom_only release.ipynb** This is the notebook for testing the COAT method with different foundation models.


```

--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------

```python
import os

class Logger():
    def __init__(self, root, exp_name):
        # Set path
        self.log_root_path = os.path.join(root, 'logs', exp_name)
        self.category = ['conversation', 'factors', 'causal_graph']

        # init log files
        os.makedirs(self.log_root_path, exist_ok=True)

    def straight_write(self, cate, content, mode='a'):
        # Write `content` to '<cate>.log' under the experiment's log directory
        assert cate in self.category
        with open(os.path.join(self.log_root_path, cate+'.log'), mode) as file:
            file.write('\n'+content+'\n')

def get_value_from_responce(responce):
    # Map an LLM reply to -1 / 0 / 1 based on the text after 'The value is:'
    if responce is None:
        return 'None'
    if 'he value is:' not in responce:
        return '?'
    # Split on the same marker that was checked above, so that both
    # 'The value is:' and 'the value is:' are handled
    last_part = responce.split('he value is:')[-1]
    if '-1' in last_part:
        return -1
    elif '0' in last_part:
        return 0
    elif '1' in last_part:
        return 1
    # No recognizable value after the marker
    return None
```

--------------------------------------------------------------------------------
/openai_utils.py:
--------------------------------------------------------------------------------

```python
from openai import OpenAI
import os
import time

class LLM_openai():
    def __init__(self, logger):
        self.client = OpenAI()
        self.logger = logger
        self.thread_id = None

    def log(self, content):
        self.logger.straight_write('conversation', content, mode='a')

    def load_assistant(self, asst_id):
        self.asst_id = asst_id
        self.assistant = self.client.beta.assistants.retrieve(asst_id)
        self.log(f'>> Loaded OpenAI assistant {self.asst_id}')

    def delete_thread(self):
        if self.thread_id is None:
            return
        response = self.client.beta.threads.delete(self.thread_id)
        assert response.deleted
        self.thread_id = None

    def chat(self, content,
             system_instruct = None, model="gpt-4-1106-preview", **kwargs):
        '''
        Output:
            - responce
        '''
        # init
        self.delete_thread()
        if system_instruct is None:
            system_instruct = 'You are an excellently helpful AI assistant for analysis and abstraction on data.'

        # chat
        completions = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_instruct},
                {"role": "user", "content": content}
            ],
            **kwargs
        )
        print(completions.id)
        self.log(('-' * 5) + f'{completions.id}' + ('-' * 10))
        self.log('User: \n' + content)

        # get responce
        responce = completions.choices[0].message.content
        self.log('ChatGPT: \n' + responce)

        return responce

    def chat_assistant(self, content, args, file_ids = None):
        '''
        Output:
            - run_result
            - responce
            - files
        '''
        # init
        self.delete_thread()
        if file_ids is None:
            file_ids = []  # avoid a shared mutable default argument
        sleep_gap = args['sleep gap']
        max_wait = args['max wait']

        # new thread
        thread = self.client.beta.threads.create(
            messages=[
                {
                    "role": "user",
                    "content": content,
                    "file_ids": file_ids
                }
            ]
        )
        print(thread.id)
        self.thread_id = thread.id
        self.log(('-' * 5) + f'{self.thread_id}' + ('-' * 10))
        self.log('User: \n' + content)

        # run
        run_result = None
        run = self.client.beta.threads.runs.create(
            thread_id=thread.id,
            assistant_id=self.assistant.id
        )
        print(run.id)
        # poll the run every `sleep_gap` seconds, for at most `max_wait` seconds
        for i in range(max_wait//sleep_gap+1):
            run = self.client.beta.threads.runs.retrieve(
                thread_id=thread.id,
                run_id=run.id
            )
            print(i * sleep_gap, run.status)
            if not (run.status in ['in_progress', 'queued']):
                print(run.status)
                run_result = run.status # == 'completed'
                break
            if i >= max_wait//sleep_gap:
                run = self.client.beta.threads.runs.cancel(
                    thread_id=thread.id,
                    run_id=run.id
                )
                run_result = 'time out'
                break
            time.sleep(sleep_gap)

        if run_result != 'completed':
            return run_result, None, None

        # get responce
        messages = self.client.beta.threads.messages.list(
            thread_id=thread.id
        )
        num_responce = len(messages.data)
        responces = []
        files = []
        for i in range(num_responce-2, -1, -1):
            this_responce = messages.data[i].content[0].text.value
            responces.append(this_responce)
            for annotation in messages.data[i].content[0].text.annotations:
                if hasattr(annotation, 'file_path'):
                    this_file_id = annotation.file_path.file_id
                    this_file_content = self.client.files.content(this_file_id).read().decode("utf-8")
                    files.append(this_file_content)

        self.log('ChatGPT: \n' + '\n'.join(responces))
        if len(files) > 0:
            self.log('\n'.join(files))

        return run_result, responces, files


```

--------------------------------------------------------------------------------
/Neuropathic/id_name.txt:
--------------------------------------------------------------------------------

```
0: DLI C1-C2
1: DLI C2-C3
2: DLI C3-C4
3: DLI C4-C5
4: DLI C5-C6
5: DLI C6-C7
6: DLI C7-C8
7: DLI C8-T1
8: DLI L1-L2
9: DLI L2-L3
10: DLI L3-L4
11: DLI L4-L5
12: DLI L5-S1
13: DLI S1-S2
14: DLI T1-T2
15: DLI T10-T11
16: DLI T11-T12
17: DLI T12-L1
18: DLI T2-T3
19: DLI T3-T4
20: DLI T4-T5
21: DLI T5-T6
22: DLI T6-T7
23: DLI T7-T8
24: DLI T8-T9
25: DLI T9-T10
26: Kraniocervikal ledskada
27: L C2 Radiculopathy
28: R C2 Radiculopathy
29: L C3 Radiculopathy
30: R C3 Radiculopathy
31: L C4 Radiculopathy
32: R C4 Radiculopathy
33: L C5 Radiculopathy
34: R C5 Radiculopathy
35: L C6 Radiculopathy
36: R C6 Radiculopathy
37: L C7 Radiculopathy
38: R C7 Radiculopathy
39: L C8 Radiculopathy
40: R C8 Radiculopathy
41: L T1 Radiculopathy
42: R T1 Radiculopathy
43: L T2 Radiculopathy
44: R T2 Radiculopathy
45: L T3 Radiculopathy
46: R T3 Radiculopathy
47: L T4 Radiculopathy
48: R T4 Radiculopathy
49: L T5 Radiculopathy
50: R T5 Radiculopathy
51: L T6 Radiculopathy
52: R T6 Radiculopathy
53: L T7 Radiculopathy
54: R T7 Radiculopathy
55: L T8 Radiculopathy
56: R T8 Radiculopathy
57: L T9 Radiculopathy
58: R T9 Radiculopathy
59: L T10 Radiculopathy
60: R T10 Radiculopathy
61: L T11 Radiculopathy
62: R T11 Radiculopathy
63: L T12 Radiculopathy
64: R T12 Radiculopathy
65: L L1 Radiculopathy
66: R L1 Radiculopathy
67: L L2 Radiculopathy
68: R L2 Radiculopathy
69: L L3 Radiculopathy
70: R L3 Radiculopathy
71: L L4 Radiculopathy
72: R L4 Radiculopathy
73: L L5 Radiculopathy
74: R L5 Radiculopathy
75: L S1 Radiculopathy
76: R S1 Radiculopathy
77: L S2 Radiculopathy
78: R S2 Radiculopathy
79: Ibs
80: L neck problems
81: Neck pain
82: R neck
83: L tinnitus
84: L eye problems
85: L ear problems
86: R tinnitus
87: R eye problems
88: R ear problems
89: Headache
90: L jaw problems
91: L forehead headache
92: Mouth
93: Forehead headache
94: R headache
95: R pta
96: Pharyngeal discomfort
97: R jaw trouble
98: Back headache
99: R back headache pain
100: L collarbone pain
101: R collarbone problems
102: Central chest pain
103: L central chest pain
104: L central chest disorders
105: R front axle problems
106: L shoulder impingement
107: R shoulder impingement
108: L shoulder problems
109: L shoulder trouble
110: R shoulder problems
111: R shoulder trouble
112: L upper arm discomfort
113: L upper elbow pain
114: Intracapular problems
115: L interscapular complaints
116: R intracapular trouble
117: L lateral elbow pain
118: L lateral arm discomfort
119: R lateral elbow pain
120: L elbow problems
121: R elbow trouble
122: L arm
123: L thumbs up
124: R thumbs up
125: L wrist problems
126: R wrist problems
127: L lower arm disorders
128: R lower arm disorders
129: L hand problems
130: R hand problems
131: L bend of arm problems
132: R armband
133: R bend of arm discomfort
134: L medial elbow problems
135: R medial elbow problems
136: L finger trouble
137: R finger trouble
138: L small finger trouble
139: R little finger trouble
140: L groin trouble
141: L medial groin disorders
142: L lateral groin discomfort
143: Central groin disorders
144: R lateral groin discomfort
145: R groin trouble
146: L adductor tendon
147: R adductor tendonitis
148: L hip disorders
149: L backache
150: Backache
151: L lumbago
152: Lumbago
153: R lumbago
154: L front thigh pain
155: R front thigh pain
156: R thigh problems
157: L leg problems
158: L thigh pain
159: R leg problems
160: R medial vadbesvär
161: L pta
162: L hip joint
163: R hip trouble
164: R hip arthritis
165: L medial knee joint disorder
166: L front knee pain
167: R medial knee joint disorder
168: R front knee pain
169: L shin
170: R shin
171: L llower leg problems
172: L knee trouble
173: R knee trouble
174: L tåledbesvär
175: L big toe problems
176: R big toe problems
177: L foot pain
178: L ankle trouble
179: R ankle trouble
180: L footstool trouble
181: R arch
182: R morton trouble
183: R fainting
184: L ischias
185: R ischias
186: L ham
187: L obesity
188: R ham
189: L toe problems
190: R foot pain
191: R tear problems
192: R obesity
193: R dorsal knee joint disorder
194: L dorsal knee joint disorder
195: L lateral knee pain
196: R lateral knee pain
197: L small toe trouble
198: L lateral foot disorders
199: R lateral foot disorders
200: R heel problems
201: Calcaneal pain
202: L heel problems
203: Coccydyni
204: L rear thigh pain
205: R rear thigh pain
206: L achilles problems
207: L achilles tendon
208: L achillodyni
209: R achilles problems
210: R achilles tendency
211: R achillodyni
212: Breast backache
213: Chest discomfort
214: L breast problems
215: R breast problems
216: Toracal dysfunction
217: Upper abdominal discomfort
218: Lateral abdominal discomfort
219: Abdominal discomfort
220: L lower abdominal discomfort
221: Lower abdominal discomfort
```
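
Each entry in `id_name.txt` above has the form `<id>: <name>`. The following is a minimal sketch, not code from the repository, of how such a listing could be loaded into a lookup table. The helper name `parse_id_name` and the three-line sample are assumptions made for illustration only.

```python
from typing import Dict


def parse_id_name(text: str) -> Dict[int, str]:
    """Parse lines of the form '<id>: <name>' into an {id: name} mapping.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    mapping: Dict[int, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # partition() splits on the first ':' only, so a name that
        # itself contains a colon would stay intact.
        idx, sep, name = line.partition(":")
        if not sep:
            continue  # not an '<id>: <name>' entry
        mapping[int(idx)] = name.strip()
    return mapping


# A few entries copied from the listing above
sample = """0: DLI C1-C2
107: R shoulder impingement
160: R medial vadbesvär
"""
id_to_name = parse_id_name(sample)
print(id_to_name[107])
```

To load the real file, the same function could be given the contents of `Neuropathic/id_name.txt` read as UTF-8 text, assuming the file on disk contains only the `<id>: <name>` entries.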