Commit 183b89bc authored by Joe Chen

Initial commit

Showing with 1391 additions and 0 deletions
venv
\ No newline at end of file
# Customer Support Q&A System
A question answering system for customer support that uses semantic search to find relevant chat histories and generates answers based on the retrieved information.
## Features
- Semantic search using OpenAI embeddings and Qdrant vector database
- Flask API backend for handling question answering
- Modern web interface for asking questions
- Combined context from multiple chat histories for better answers
- GPT-4.1-mini powered answer generation (the model set in `search_questions.py`)
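The combined-context feature can be pictured with a small sketch. It assumes each search hit is a plain dict carrying the `chat_unique_id` and `chat_history` payload fields that the indexing code stores; the helper name `combine_histories` and the sample data are illustrative only.

```python
from typing import Dict, List

# Separator placed between conversations before they are sent to the LLM
SEPARATOR = "\n\n---NEXT CONVERSATION---\n\n"


def combine_histories(hits: List[Dict[str, str]]) -> str:
    """Join the chat histories of the hits, keeping each conversation once."""
    seen = set()
    histories = []
    for hit in hits:
        uid = hit.get("chat_unique_id")
        if uid in seen:
            # Several questions can come from the same chat; include it once
            continue
        seen.add(uid)
        histories.append(hit.get("chat_history", ""))
    return SEPARATOR.join(histories)


hits = [
    {"chat_unique_id": "a1", "chat_history": "Customer: How do I reset my password?"},
    {"chat_unique_id": "a1", "chat_history": "Customer: How do I reset my password?"},
    {"chat_unique_id": "b2", "chat_history": "Customer: What are your business hours?"},
]
context = combine_histories(hits)
print(context)
```

Deduplicating before joining keeps the prompt short when several matched questions point back to the same conversation.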
## Prerequisites
- Python 3.x
- OpenAI API key
- Qdrant server running (URL configured in the code)
## Installation
1. Clone the repository
2. Install required dependencies:
```bash
python3 -m pip install -r requirements.txt
```
3. Make sure your OpenAI API key is configured where the code expects it (the `key` variable in `search_questions.py`, and in `main.py` when rebuilding the index)
4. Ensure the Qdrant server URL is correctly configured (default: http://10.80.7.190:6333)
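One common way to keep the key out of the source files is to read it from the environment. The sketch below assumes the conventional `OPENAI_API_KEY` variable; `QDRANT_URL` and the `load_settings` helper are hypothetical names used only for illustration.

```python
import os


def load_settings(env=os.environ):
    """Return the API key and Qdrant URL, preferring environment variables."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # QDRANT_URL is a hypothetical variable name; the fallback is the default
    # server URL mentioned in this README.
    qdrant_url = env.get("QDRANT_URL", "http://10.80.7.190:6333")
    return {"api_key": api_key, "qdrant_url": qdrant_url}


# A plain dict stands in for the real environment in this example
settings = load_settings({"OPENAI_API_KEY": "sk-example"})
print(settings["qdrant_url"])
```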
## Running the Application
### Quick Start
Use the provided startup script:
```bash
./start_app.sh
```
This script will check for dependencies, install them if needed, and start the server.
### Manual Start
1. Make sure the Qdrant server is running and accessible
2. Start the Flask API server:
```bash
python3 api.py
```
3. Access the web interface at [http://localhost:5000](http://localhost:5000)
## Using the Standalone HTML
If you prefer not to run the Flask server, you can use the `standalone.html` file:
1. Open `standalone.html` in your browser
2. Configure the API endpoint if needed (defaults to http://localhost:5000/api/ask)
3. Ask questions and get answers
## API Usage
The system exposes a simple API endpoint at `/api/ask` that accepts POST requests with a JSON body:
```json
{
"question": "How do I reset my password?"
}
```
The response is a JSON object with the following structure:
```json
{
"question": "How do I reset my password?",
"answer": "To reset your password, go to the login page and click on 'Forgot Password'. Then follow the instructions sent to your email.",
"results": [
{
"score": 0.89,
"question": "What's the process to reset my password?",
"answer": "Click on 'Forgot Password' on the login screen and follow email instructions.",
"chat_id": 123,
"preview": "2024-10-30 16:00:00 - Customer: How do I reset my password?..."
}
]
}
```
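A minimal client sketch for this endpoint, using only the Python standard library. The helper names `build_ask_request` and `ask` are illustrative and not part of the repository; the URL is the default from this README.

```python
import json
import urllib.request

API_URL = "http://localhost:5000/api/ask"  # default endpoint from this README


def build_ask_request(question: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request carrying the question as a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = API_URL) -> dict:
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_ask_request(question, url), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_ask_request("How do I reset my password?")
print(request.get_method(), request.full_url)
# With the server running: print(ask("How do I reset my password?")["answer"])
```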
## Configuration
The system uses the following default settings (configured in `search_questions.py`):
- Default threshold: 0.7 (minimum similarity score; `THRESHOLD` in `search_questions.py`)
- Default result limit: 4 (maximum number of results to return)
- Embedding model: text-embedding-3-small
- LLM for answer generation: gpt-4.1-mini (Q&A extraction in `main.py` uses gpt-4o-mini)
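To illustrate how the threshold and result limit interact, here is a pure-Python picture of the filtering. In the real system Qdrant applies both server-side through the `score_threshold` and `limit` arguments of the search call; the `select_hits` helper and the sample scores are illustrative, and 0.7 mirrors `THRESHOLD` in `search_questions.py`.

```python
from typing import List, Tuple


def select_hits(scored: List[Tuple[str, float]], threshold: float, limit: int) -> List[Tuple[str, float]]:
    """Keep items scoring at least `threshold`, best first, at most `limit` of them."""
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


candidates = [("reset password", 0.89), ("change email", 0.72), ("business hours", 0.41)]
print(select_hits(candidates, threshold=0.7, limit=4))
# → [('reset password', 0.89), ('change email', 0.72)]
```

A higher threshold returns fewer but closer matches; a lower one admits looser matches into the context passed to the LLM.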
## Files Overview
- `api.py` - Flask API server
- `search_questions.py` - Core search and question-answering functionality
- `main.py` - Data processing and vector database preparation
- `static/index.html` - Web interface for the Flask server
- `standalone.html` - Standalone web interface that can be used independently
- `requirements.txt` - List of required Python packages
- `start_app.sh` - Startup script to check dependencies and launch the server
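As a sketch of the indexing step in `main.py`: each chat history is hashed to a stable ID, and each extracted question gets an integer point ID derived from that hash plus the question's index. The function names below are illustrative; the hashing scheme follows the repository code.

```python
import hashlib


def chat_unique_id(chat_history: str) -> str:
    """Content hash of a chat history; identical chats get identical IDs."""
    return hashlib.md5(chat_history.encode("utf-8")).hexdigest()


def point_id(chat_uid: str, question_index: int) -> int:
    """Stable integer ID for one question of one chat, reduced below 2**63."""
    raw = f"{chat_uid}_{question_index}"
    return int(hashlib.md5(raw.encode()).hexdigest(), 16) % (2**63)


uid = chat_unique_id("Customer: What are your business hours?")
print(uid, point_id(uid, 0))
```

Because the IDs are derived from content, re-indexing the same chat overwrites its existing points instead of creating duplicates.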
\ No newline at end of file
File added
from flask import Flask, request, jsonify, send_from_directory
from flask_cors import CORS
import json
import os
from search_questions import create_embedding, connect_to_qdrant, generate_answer_from_gpt, COLLECTION_NAME, LIMIT, THRESHOLD
app = Flask(__name__, static_folder='static')
CORS(app) # Enable CORS for all routes
@app.route('/api/ask', methods=['POST'])
def ask_question():
    """API endpoint to handle question answering"""
    try:
        # Get the question from request
        data = request.get_json()
        if not data or 'question' not in data:
            return jsonify({'error': 'No question provided'}), 400
        query = data['question']
        # Connect to Qdrant
        qdrant_client = connect_to_qdrant()
        if not qdrant_client:
            return jsonify({'error': 'Failed to connect to vector database'}), 500
        # Create embedding for the query
        query_embedding = create_embedding(query)
        if not query_embedding:
            return jsonify({'error': 'Failed to create embedding'}), 500
        # Search for similar questions
        search_results = qdrant_client.search(
            collection_name=COLLECTION_NAME,
            query_vector=query_embedding,
            limit=LIMIT,
            score_threshold=THRESHOLD
        )
        if not search_results:
            return jsonify({
                'question': query,
                'answer': 'I could not find any relevant information to answer your question.',
                'results': []
            })
        # Extract chat histories from results
        formatted_results = []
        seen_chat_ids = set()
        all_chat_histories = []
        for result in search_results:
            chat_id = result.payload.get('chat_unique_id')
            # Skip if we've already seen this chat history
            if chat_id in seen_chat_ids:
                continue
            seen_chat_ids.add(chat_id)
            # Get chat history
            chat_history = result.payload.get('chat_history', '')
            all_chat_histories.append(chat_history)
            formatted_results.append({
                'score': result.score,
                'question': result.payload.get('question', ''),
                'answer': result.payload.get('answer', ''),
                'chat_id': result.payload.get('chat_id', ''),
                'preview': (chat_history[:150] + "...") if len(chat_history) > 150 else chat_history
            })
        # Generate a single AI answer based on all combined chat histories
        ai_answer = "No answer could be generated."
        if all_chat_histories:
            combined_histories = "\n\n---NEXT CONVERSATION---\n\n".join(all_chat_histories)
            ai_answer = generate_answer_from_gpt(query, combined_histories)
        # Return the results
        response = {
            'question': query,
            'answer': ai_answer,
            'results': formatted_results
        }
        return jsonify(response)
    except Exception as e:
        print(f"Error processing request: {e}")
        return jsonify({'error': str(e)}), 500
@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def serve_static(path):
    """Serve static files"""
    if not path:
        path = 'index.html'
    return send_from_directory('static', path)
if __name__ == '__main__':
    print("Customer Support Q&A API server running on http://localhost:5000")
    app.run(debug=True, host='0.0.0.0', port=5000)
\ No newline at end of file
File added
File added
import pandas as pd
import re
# Read the Excel file
input_file = 'chat_history_with_prompts_openai-2.xlsx'
output_file = 'chat_history_cleaned.xlsx'
df = pd.read_excel(input_file)
# Get column names
column_names = df.columns.tolist()
third_column_name = column_names[2] # Get the name of the third column
# Function to remove datetime pattern (format: YYYY-MM-DD HH:MM:SS - )
def remove_datetime(text):
    if isinstance(text, str):
        # Split on the datetime prefixes: this removes them and marks the message boundaries.
        # (Removing them first with re.sub would leave nothing for a later split to match.)
        parts = re.split(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} - ', text)
        # Join the messages with a blank line between them
        result = '\n\n'.join(part.strip() for part in parts if part.strip())
        return result
    return text
# Clean the data in the third column
df[third_column_name] = df[third_column_name].apply(remove_datetime)
# Delete the first two columns - keep only the third column
df = df[[third_column_name]]
# Save the modified DataFrame to a new Excel file
df.to_excel(output_file, index=False)
print(f"Cleaned data saved to {output_file}")
print(f"The datetime patterns have been removed and only the '{third_column_name}' column is kept.")
# Display a sample of the cleaned data
print("\nSample of cleaned data:")
print(df.head())
\ No newline at end of file
import openai
import json
import os
import pandas as pd
import re
import time
import threading
import queue
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Dict, Any
from qdrant_client import QdrantClient
from qdrant_client.http import models
from qdrant_client.http.models import PointStruct
from openai import OpenAI
key = os.environ.get("OPENAI_API_KEY")  # Read the key from the environment; never commit secrets to source control
# Set up OpenAI API key
client = OpenAI(
api_key=key,
)
# Constants for Qdrant and embeddings
EMBEDDING_MODEL = "text-embedding-3-small"
EMBEDDING_DIMENSION = 1536 # Dimension for text-embedding-3-small
COLLECTION_NAME = "customer_support_qa"
MAX_THREADS = 30 # Maximum number of concurrent API calls
API_CALL_TIMEOUT = 30 # Timeout for API calls in seconds
# Thread-safe queue for results
results_queue = queue.Queue()
# Thread-local storage for OpenAI clients
thread_local = threading.local()
def get_client():
    """Get thread-local OpenAI client"""
    if not hasattr(thread_local, "client"):
        thread_local.client = OpenAI(api_key=key)
    return thread_local.client
def generate_chat_id(chat_history: str) -> str:
    """Generate a unique ID for a chat history using hash to eliminate duplicates"""
    # Create a hash of the chat history content
    hash_obj = hashlib.md5(chat_history.encode('utf-8'))
    return hash_obj.hexdigest()
def extract_standalone_qa(chat_history: str) -> str:
    """Extract Q&A pairs from chat history using GPT-4o-mini"""
    # System prompt with clear instructions and examples
    system_prompt = """
You are an AI assistant tasked with analyzing chat histories between customers and customer care agents to identify standalone questions that an AI can answer completely based on the chat, without requiring human interaction, similar to FAQs. A standalone question is one where:
- The customer's query (explicit or implicit) is fully addressed by the agent's response.
- The response provides information or instructions that the customer can use independently.
- No further interaction within the chat is required, such as providing additional information (e.g., verification codes) or the agent performing actions needing confirmation (e.g., manual resets).
**Example 1:**
Chat history:
2024-10-30 16:00:00 - Customer: How do I update my shipping address?
2024-10-30 16:01:00 - Human Agent: To update your shipping address, go to your account settings, select 'Addresses,' and edit your shipping address there.
Standalone Q&A:
[
{"question": "How do I update my shipping address?", "answer": "To update your shipping address, go to your account settings, select 'Addresses,' and edit your shipping address there."}
]
**Example 2:**
Chat history:
2024-10-30 15:26:44 - Customer: This app is garbage! I've had an account for years but can't log in and I can't reset my password.
2024-10-30 15:27:40 - Human Agent: Hello, you've reached customer care. This is Marcela speaking. Could I have your Zmodo account email?
2024-10-30 15:29:28 - Customer: choens13@gmail.com
2024-10-30 15:31:44 - Human Agent: Okay do you have access to this email currently, I am going to try to reset it from my end, but I will need the 6 digit code from the verification email.
2024-10-30 15:31:58 - Customer: OK
Standalone Q&A:
[]
(Note: The agent's response requires a verification code, indicating human interaction, so no standalone Q&A exists.)
**Example 3:**
Chat history:
2024-10-30 17:00:00 - Customer: What are your business hours?
2024-10-30 17:01:00 - Human Agent: Our business hours are 9 AM to 5 PM, Monday to Friday.
Standalone Q&A:
[
{"question": "What are your business hours?", "answer": "Our business hours are 9 AM to 5 PM, Monday to Friday."}
]
Analyze the following chat history and identify standalone Q&A pairs. Output the result in JSON format as a list of dictionaries, each with "question" and "answer" keys. If no standalone pairs exist, return an empty list.
IMPORTANT: Your response must be valid JSON only, with no additional text outside the JSON array.
"""
    # Trim chat history if it's too long
    if len(chat_history) > 10000:
        # #print(f"Chat history is very long ({len(chat_history)} chars), trimming...")
        chat_history = chat_history[:10000] + "...[truncated]"
    # User message with the chat history
    user_message = f"Analyze this chat history:\n\n{chat_history}"
    # Call OpenAI API
    try:
        local_client = get_client()
        response = local_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message}
            ],
            temperature=0.0,  # Low temperature for consistent output
            timeout=API_CALL_TIMEOUT,
            response_format={"type": "json_object"}  # Request JSON format
        )
        # Extract the response
        content = response.choices[0].message.content
        return content
    except Exception as e:
        #print(f"Error in extract_standalone_qa: {e}")
        return "[]"
def create_embedding(text: str) -> List[float]:
    """Create an embedding for the given text using OpenAI's embedding model."""
    try:
        local_client = get_client()
        response = local_client.embeddings.create(
            input=text,
            model=EMBEDDING_MODEL,
            timeout=API_CALL_TIMEOUT
        )
        embedding = response.data[0].embedding
        return embedding
    except Exception as e:
        #print(f"Error creating embedding: {e}")
        # Return a zero vector of the correct dimension as fallback
        return [0.0] * EMBEDDING_DIMENSION
def extract_questions_from_qa_result(qa_result: str) -> List[Dict[str, str]]:
    """Extract question-answer pairs from the LLM response."""
    #print("qa_result", qa_result)
    # Clean up the response - sometimes GPT adds extra text before or after JSON
    try:
        # Try to find JSON content enclosed in square brackets
        json_pattern = r'\[.*\]'
        json_matches = re.search(json_pattern, qa_result, re.DOTALL)
        if json_matches:
            json_content = json_matches.group(0)
            #print("Extracted JSON content:", json_content)
            try:
                qa_data = json.loads(json_content)
                #print("Parsed qa_data:", qa_data)
                if isinstance(qa_data, list):
                    return qa_data
            except json.JSONDecodeError as e:
                pass
                #print(f"JSON parsing error after extraction: {e}")
        # If bracket extraction failed, try direct parsing
        try:
            qa_data = json.loads(qa_result)
            #print("Direct parsed qa_data:", qa_data)
            if isinstance(qa_data, list):
                return qa_data
        except json.JSONDecodeError as e:
            pass
            #print(f"Direct JSON parsing error: {e}")
        # If all JSON approaches fail, try regex extraction for individual pairs
        pattern = r'"question":\s*"(.*?)",\s*"answer":\s*"(.*?)"'
        matches = re.findall(pattern, qa_result, re.DOTALL)
        if matches:
            #print(f"Extracted {len(matches)} Q&A pairs via regex")
            return [{"question": q, "answer": a} for q, a in matches]
        # Try another pattern if the first one fails
        pattern2 = r'question:\s*(.+?)\nanswer:\s*(.+?)(?:\n\n|\Z)'
        matches2 = re.findall(pattern2, qa_result, re.DOTALL | re.IGNORECASE)
        if matches2:
            #print(f"Extracted {len(matches2)} Q&A pairs via second regex")
            return [{"question": q.strip(), "answer": a.strip()} for q, a in matches2]
    except Exception as e:
        pass
        #print(f"Error during extraction: {e}")
    #print("No questions could be extracted from the response")
    return []
def setup_qdrant_collection():
    """Set up a Qdrant collection for storing embeddings."""
    try:
        qdrant_client = QdrantClient(url="http://10.80.7.190:6333")
        # Check if collection exists and recreate if needed
        collections = qdrant_client.get_collections().collections
        collection_names = [c.name for c in collections]
        if COLLECTION_NAME in collection_names:
            #print(f"Collection '{COLLECTION_NAME}' already exists, recreating it...")
            qdrant_client.delete_collection(COLLECTION_NAME)
        # Create collection
        qdrant_client.recreate_collection(
            collection_name=COLLECTION_NAME,
            vectors_config=models.VectorParams(
                size=EMBEDDING_DIMENSION,
                distance=models.Distance.COSINE
            )
        )
        #print(f"Collection '{COLLECTION_NAME}' created successfully")
        # Test insert a dummy vector
        test_vector = [0.01] * EMBEDDING_DIMENSION
        try:
            qdrant_client.upsert(
                collection_name=COLLECTION_NAME,
                points=[
                    PointStruct(
                        id=0,
                        vector=test_vector,
                        payload={"test": "test"}
                    )
                ]
            )
            #print("Test vector inserted successfully")
            # Test search
            search_result = qdrant_client.search(
                collection_name=COLLECTION_NAME,
                query_vector=test_vector,
                limit=1
            )
            if search_result:
                #print("Test search successful")
                # Delete test vector
                qdrant_client.delete(
                    collection_name=COLLECTION_NAME,
                    points_selector=models.PointIdsList(
                        points=[0]
                    )
                )
        except Exception as e:
            #print(f"Error testing Qdrant operations: {e}")
            raise
        return qdrant_client
    except Exception as e:
        #print(f"Error setting up Qdrant: {e}")
        raise
def process_chat(chat_id: int, chat_history: str, chat_unique_id: str, qdrant_client):
    """Process a single chat history in a separate thread"""
    try:
        #print(f"\nProcessing chat #{chat_id}:")
        #print("-"*50)
        #print(f"Chat content preview: {chat_history[:100]}...")
        # Extract standalone Q&A pairs
        qa_result = extract_standalone_qa(chat_history)
        # Skip processing if the result is empty or contains error indicators
        if not qa_result or qa_result.strip() == "[]":
            #print(f"Chat #{chat_id}: No meaningful response from LLM")
            return
        qa_pairs = extract_questions_from_qa_result(qa_result)
        if not qa_pairs:
            #print(f"Chat #{chat_id}: No question-answer pairs could be extracted")
            return
        #print(f"Chat #{chat_id}: Found {len(qa_pairs)} question-answer pairs")
        # Process each question-answer pair
        local_results = []
        for i, qa_pair in enumerate(qa_pairs):
            question = qa_pair.get("question", "")
            answer = qa_pair.get("answer", "")
            if not question or not answer:
                #print(f"Skipping incomplete QA pair: {qa_pair}")
                continue
            # Create a unique point ID combining chat ID and question index
            point_id = f"{chat_unique_id}_{i}"
            hash_point_id = int(hashlib.md5(point_id.encode()).hexdigest(), 16) % (2**63)
            #print(f"Chat #{chat_id}, Question: {question}")
            # Create embedding for the question
            question_embedding = create_embedding(question)
            # Verify embedding (an all-zero vector is the failure fallback)
            if not any(question_embedding):
                #print(f"Warning: Empty embedding for question: {question}")
                continue
            try:
                # Store in Qdrant
                qdrant_client.upsert(
                    collection_name=COLLECTION_NAME,
                    points=[
                        PointStruct(
                            id=hash_point_id,
                            vector=question_embedding,
                            payload={
                                "question": question,
                                "answer": answer,
                                "chat_history": chat_history,
                                "chat_id": chat_id,
                                "chat_unique_id": chat_unique_id
                            }
                        )
                    ]
                )
                #print(f"Successfully stored embedding for question {i+1}")
                local_results.append({
                    'chat_id': chat_id,
                    'chat_unique_id': chat_unique_id,
                    'question': question,
                    'answer': answer
                })
            except Exception as e:
                pass
                #print(f"Error storing embedding: {e}")
        # Add all results to the queue
        for result in local_results:
            results_queue.put(result)
        #print(f"Completed processing chat #{chat_id} with {len(local_results)} questions")
    except Exception as e:
        #print(f"Error processing chat #{chat_id}: {e}")
        import traceback
        traceback.print_exc()
def main():
    # Read the cleaned Excel file
    input_file = 'chat_history_cleaned.xlsx'
    df = pd.read_excel(input_file)
    # Since we dropped the first two columns, we should access the first column now
    column_name = df.columns[0]
    # Set up Qdrant
    qdrant_client = setup_qdrant_collection()
    # Create a dictionary to keep track of unique chat histories
    unique_chats = {}
    # Generate unique IDs for each chat history
    for i in range(len(df)):
        chat_history = df.iloc[i][column_name]
        # Skip empty cells (pandas reads them as NaN, which cannot be hashed as text)
        if not isinstance(chat_history, str) or not chat_history.strip():
            continue
        # Generate a unique ID for this chat history
        chat_unique_id = generate_chat_id(chat_history)
        # Key by the content hash so duplicate chat histories are processed only once
        if chat_unique_id in unique_chats:
            continue
        unique_chats[chat_unique_id] = {
            'chat_id': i+1,
            'chat_history': chat_history,
            'chat_unique_id': chat_unique_id
        }
    #print(f"Processing {len(unique_chats)} chat histories with up to {MAX_THREADS} threads")
    # Process chats using ThreadPoolExecutor
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as executor:
        # Submit tasks to the executor
        futures = []
        for chat_data in unique_chats.values():
            chat_id = chat_data['chat_id']
            chat_history = chat_data['chat_history']
            chat_unique_id = chat_data['chat_unique_id']
            future = executor.submit(
                process_chat,
                chat_id,
                chat_history,
                chat_unique_id,
                qdrant_client
            )
            futures.append(future)
        # Wait for all tasks to complete
        for future in futures:
            future.result()
    # Collect all results from the queue
    results = []
    while not results_queue.empty():
        results.append(results_queue.get())
# Save results to a new Excel file
# if results:
# results_df = pd.DataFrame(results)
# results_df.to_excel('extracted_questions.xlsx', index=False)
# #print(f"\nProcessed {len(results)} questions. Results saved to 'extracted_questions.xlsx'")
# else:
# #print("No questions were extracted from the chat histories.")
# # Example search in Qdrant
# if results:
# example_question = results[0]['question']
# #print(f"\nExample search for: '{example_question}'")
# search_embedding = create_embedding(example_question)
# search_results = qdrant_client.search(
# collection_name=COLLECTION_NAME,
# query_vector=search_embedding,
# limit=3
# )
# #print("\nSearch results:")
# for result in search_results:
# #print(f"Score: {result.score:.4f}")
# #print(f"Question: {result.payload['question']}")
# #print(f"Answer: {result.payload['answer']}")
# #print(f"Chat ID: {result.payload['chat_id']} (Unique ID: {result.payload['chat_unique_id']})")
# #print("-"*30)
if __name__ == "__main__":
    main()
\ No newline at end of file
File added
flask==3.1.0
flask-cors==5.0.1
openai==1.51.2
pandas==2.2.3
qdrant-client==1.13.3
python-dateutil==2.9.0.post0
numpy==2.2.4
requests==2.32.3
tqdm==4.66.5
import openai
import json
import pandas as pd
from typing import List, Dict, Any
from qdrant_client import QdrantClient
from qdrant_client.http import models
# Constants
QDRANT_URL = "http://10.80.7.190:6333"
COLLECTION_NAME = "customer_support_qa"
EMBEDDING_MODEL = "text-embedding-3-small"
LIMIT = 4 # Number of results to return
THRESHOLD = 0.7 # Default similarity threshold
# OpenAI API key - read from the environment instead of hardcoding a secret in the source
import os
from openai import OpenAI
key = os.environ.get("OPENAI_API_KEY")
client = OpenAI(api_key=key)
def create_embedding(text: str) -> List[float]:
    """Create an embedding for the given text using OpenAI's embedding model."""
    try:
        response = client.embeddings.create(
            input=text,
            model=EMBEDDING_MODEL
        )
        embedding = response.data[0].embedding
        return embedding
    except Exception as e:
        print(f"Error creating embedding: {e}")
        return None
def connect_to_qdrant():
    """Connect to Qdrant and return the client."""
    try:
        qdrant_client = QdrantClient(url=QDRANT_URL)
        print("Connected to Qdrant successfully")
        # Get collection info
        collection_info = qdrant_client.get_collection(COLLECTION_NAME)
        print(f"Collection '{COLLECTION_NAME}' info:")
        print(f" - Vector size: {collection_info.config.params.vectors.size}")
        print(f" - Distance: {collection_info.config.params.vectors.distance}")
        return qdrant_client
    except Exception as e:
        print(f"Error connecting to Qdrant: {e}")
        return None
def generate_answer_from_gpt(query: str, chat_history: str):
    """Generate an answer based on the chat history and query using GPT-4.1-mini."""
    try:
        system_prompt = """
You are a customer support AI assistant. Your task is to provide DIRECT and DETAILED answers to customer questions.
IMPORTANT GUIDELINES:
1. ONLY answer what was specifically asked in the question
2. Use ONLY information from the provided chat history
3. Do NOT add any extra information or explanations that weren't requested
4. Your answer should be brief and to the point
5. If the chat history doesn't contain information relevant to the question, say "I don't have enough information to answer that question"
6. Do NOT introduce yourself or add pleasantries at the start or end
Remember: Stay focused on the exact question asked.
"""
        user_message = f"""
Customer question: {query}
Chat history:
{chat_history}
Provide ONLY the specific answer to the question, with no additional information.
"""
        response = client.chat.completions.create(
            model="gpt-4.1-mini",  # Model used for answer generation
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message}
            ],
            temperature=0.3
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error generating answer from GPT: {e}")
        return "Unable to generate an answer at this time."
def search_similar_questions(query: str, limit: int = LIMIT, threshold: float = THRESHOLD):
    """Search for similar questions in the vector database."""
    qdrant_client = connect_to_qdrant()
    if not qdrant_client:
        return []
    print(f"Creating embedding for query: '{query}'")
    query_embedding = create_embedding(query)
    if not query_embedding:
        return []
    print(f"Searching for similar questions (limit: {limit}, threshold: {threshold})")
    try:
        search_results = qdrant_client.search(
            collection_name=COLLECTION_NAME,
            query_vector=query_embedding,
            limit=limit,
            score_threshold=threshold
        )
        if not search_results:
            print("No results found")
            return []
        print(f"Found {len(search_results)} results")
        # Process and return results
        formatted_results = []
        seen_chat_ids = set()  # To track unique chat histories
        all_chat_histories = []  # Collect all relevant chat histories
        for result in search_results:
            chat_id = result.payload.get('chat_unique_id')
            # Skip if we've already seen this chat history
            if chat_id in seen_chat_ids:
                continue
            seen_chat_ids.add(chat_id)
            # Get chat history
            chat_history = result.payload.get('chat_history', '')
            all_chat_histories.append(chat_history)
            formatted_results.append({
                'score': result.score,
                'question': result.payload.get('question', ''),
                'answer': result.payload.get('answer', ''),
                'chat_id': result.payload.get('chat_id', ''),
            })
        # Generate a single AI answer based on all combined chat histories
        if all_chat_histories:
            combined_histories = "\n\n---NEXT CONVERSATION---\n\n".join(all_chat_histories)
            ai_answer = generate_answer_from_gpt(query, combined_histories)
            print("AI Generated Answer:")
            print(f"{ai_answer}")
        return formatted_results
    except Exception as e:
        print(f"Error searching Qdrant: {e}")
        return []
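def display_results(results):
    """Display search results in a readable format."""
    if not results:
        print("No matching results found.")
        return
    # Print one short block per matched question
    for i, result in enumerate(results, 1):
        print(f"\nResult {i} (score: {result['score']:.4f}, chat ID: {result['chat_id']})")
        print(f"Question: {result['question']}")
        print(f"Answer: {result['answer']}")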
def save_results_to_excel(results, filename="search_results.xlsx"):
    """Save search results to an Excel file."""
    if not results:
        print("No results to save.")
        return
    df = pd.DataFrame(results)
    df.to_excel(filename, index=False)
    print(f"Results saved to {filename}")
def main():
    while True:
        query = input("\nEnter your search query (or 'quit' to exit): ")
        if query.lower() in ['quit', 'exit', 'q']:
            break
        if not query.strip():
            print("Please enter a valid query.")
            continue
        # Perform search with default parameters
        results = search_similar_questions(query)
        # Display results
        display_results(results)
    print("Thank you for using Customer Support Question Search!")
if __name__ == "__main__":
    main()
\ No newline at end of file
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Customer Support Q&A</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<style>
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background-color: #f8f9fa;
padding: 20px;
}
.container {
max-width: 800px;
margin: 0 auto;
}
.question-form {
background-color: white;
padding: 30px;
border-radius: 10px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
}
.answer-container {
background-color: white;
padding: 30px;
border-radius: 10px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
display: none;
}
.source-results {
margin-top: 30px;
}
.result-card {
margin-bottom: 15px;
border-left: 4px solid #0d6efd;
}
.loading {
display: none;
text-align: center;
margin: 20px 0;
}
.spinner-border {
width: 3rem;
height: 3rem;
}
h1 {
color: #0d6efd;
margin-bottom: 25px;
}
.answer-text {
padding: 15px;
background-color: #f8f9fa;
border-radius: 5px;
border-left: 4px solid #198754;
}
.error-message {
color: #dc3545;
margin-top: 15px;
display: none;
}
.api-settings {
margin-top: 20px;
padding: 15px;
background-color: #f8f9fa;
border-radius: 5px;
display: none;
}
.settings-toggle {
cursor: pointer;
color: #0d6efd;
text-decoration: underline;
font-size: 0.9rem;
}
</style>
</head>
<body>
<div class="container">
<h1 class="text-center">Customer Support Q&A</h1>
<div class="question-form">
<form id="questionForm">
<div class="mb-3">
<label for="questionInput" class="form-label">Ask a question:</label>
<input type="text" class="form-control form-control-lg" id="questionInput"
placeholder="How do I reset my password?">
</div>
<button type="submit" class="btn btn-primary btn-lg">Get Answer</button>
<div class="mt-3">
<span class="settings-toggle" id="settingsToggle">⚙️ API Settings</span>
</div>
</form>
<div class="api-settings" id="apiSettings">
<div class="mb-3">
<label for="apiEndpoint" class="form-label">API Endpoint:</label>
<input type="text" class="form-control" id="apiEndpoint" value="http://localhost:5000/api/ask">
</div>
<button class="btn btn-sm btn-outline-secondary" id="saveSettings">Save Settings</button>
</div>
<div class="error-message" id="errorMessage"></div>
<div class="loading" id="loadingSpinner">
<div class="spinner-border text-primary" role="status">
<span class="visually-hidden">Loading...</span>
</div>
<p class="mt-2">Searching for answers...</p>
</div>
</div>
<div class="answer-container" id="answerContainer">
<h2>Answer:</h2>
<div class="answer-text" id="answerText"></div>
<div class="source-results" id="sourceResults">
<h3>Source Conversations:</h3>
<div id="resultsList"></div>
</div>
</div>
</div>
<script>
document.addEventListener('DOMContentLoaded', function() {
const questionForm = document.getElementById('questionForm');
const questionInput = document.getElementById('questionInput');
const loadingSpinner = document.getElementById('loadingSpinner');
const answerContainer = document.getElementById('answerContainer');
const answerText = document.getElementById('answerText');
const resultsList = document.getElementById('resultsList');
const errorMessage = document.getElementById('errorMessage');
const settingsToggle = document.getElementById('settingsToggle');
const apiSettings = document.getElementById('apiSettings');
const apiEndpoint = document.getElementById('apiEndpoint');
const saveSettings = document.getElementById('saveSettings');
// Load saved API endpoint from localStorage
if (localStorage.getItem('apiEndpoint')) {
apiEndpoint.value = localStorage.getItem('apiEndpoint');
}
// Toggle settings panel
settingsToggle.addEventListener('click', function() {
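// The panel is hidden by CSS, so its inline style starts empty; test for 'block' so the first click opens it
apiSettings.style.display = apiSettings.style.display === 'block' ? 'none' : 'block';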
});
// Save settings
saveSettings.addEventListener('click', function() {
localStorage.setItem('apiEndpoint', apiEndpoint.value);
apiSettings.style.display = 'none';
showError('Settings saved!', 'success');
});
questionForm.addEventListener('submit', function(e) {
e.preventDefault();
const question = questionInput.value.trim();
if (!question) {
showError('Please enter a question.');
return;
}
// Get API endpoint from settings
const endpoint = apiEndpoint.value.trim();
if (!endpoint) {
showError('Please enter a valid API endpoint in settings.');
return;
}
// Reset UI
errorMessage.style.display = 'none';
answerContainer.style.display = 'none';
loadingSpinner.style.display = 'block';
// Send API request
fetch(endpoint, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({ question: question })
})
.then(response => {
if (!response.ok) {
throw new Error('Network response was not ok');
}
return response.json();
})
.then(data => {
loadingSpinner.style.display = 'none';
// Display answer
answerText.textContent = data.answer; // textContent avoids rendering model output as HTML
// Display source results
resultsList.innerHTML = '';
if (data.results && data.results.length > 0) {
data.results.forEach((result, index) => {
const resultCard = document.createElement('div');
resultCard.className = 'card result-card';
const cardBody = document.createElement('div');
cardBody.className = 'card-body';
const relevanceText = document.createElement('small');
relevanceText.className = 'text-muted d-block mb-2';
relevanceText.textContent = `Relevance: ${(result.score * 100).toFixed(2)}%`;
const questionText = document.createElement('h5');
questionText.className = 'card-title';
questionText.textContent = result.question;
// Named resultAnswer to avoid shadowing the outer answerText element
const resultAnswer = document.createElement('p');
resultAnswer.className = 'card-text';
resultAnswer.textContent = result.answer;
const previewTitle = document.createElement('p');
previewTitle.className = 'card-text mt-2 mb-1 fw-bold';
previewTitle.textContent = 'Conversation preview:';
const previewText = document.createElement('p');
previewText.className = 'card-text small text-muted';
previewText.textContent = result.preview;
cardBody.appendChild(relevanceText);
cardBody.appendChild(questionText);
cardBody.appendChild(resultAnswer);
cardBody.appendChild(previewTitle);
cardBody.appendChild(previewText);
resultCard.appendChild(cardBody);
resultsList.appendChild(resultCard);
});
} else {
resultsList.innerHTML = '<p>No source conversations found.</p>';
}
answerContainer.style.display = 'block';
})
.catch(error => {
loadingSpinner.style.display = 'none';
showError('Error fetching answer: ' + error.message);
console.error('Error:', error);
});
});
function showError(message, type = 'error') {
errorMessage.textContent = message;
errorMessage.style.display = 'block';
if (type === 'success') {
errorMessage.style.color = '#198754';
// Hide success messages after 3 seconds
setTimeout(() => {
errorMessage.style.display = 'none';
}, 3000);
} else {
errorMessage.style.color = '#dc3545';
}
}
});
</script>
</body>
</html>
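
The submit handler in `standalone.html` only checks that the configured endpoint is non-empty before passing it to `fetch`. As a minimal sketch, assuming malformed values should be rejected before any request is sent, a hypothetical helper `normalizeEndpoint` (not part of the original file) could parse the value with the standard `URL` constructor. Restricting the scheme to `http`/`https` is an assumption of this sketch.

```javascript
// Hypothetical helper, not part of the original page: validates the endpoint
// string read from the settings field before it is handed to fetch().
function normalizeEndpoint(value) {
  const trimmed = String(value ?? '').trim();
  if (!trimmed) {
    return null; // nothing configured
  }
  let url;
  try {
    url = new URL(trimmed); // throws a TypeError on malformed input
  } catch (err) {
    return null;
  }
  // Assumption: only http(s) endpoints make sense for this API.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return null;
  }
  return url.href;
}

console.log(normalizeEndpoint('  http://localhost:5000/api/ask  '));
console.log(normalizeEndpoint('not a url'));
```

A `null` result could be routed to the existing `showError('Please enter a valid API endpoint in settings.')` call. Note that this sketch also rejects relative paths such as `/api/ask`, since `new URL` requires an absolute URL when no base is given.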
#!/bin/bash
echo "Starting Customer Support Q&A System"
echo "-----------------------------------"
# Check if Python 3 is installed
if ! command -v python3 &> /dev/null; then
echo "Error: Python 3 is not installed or not in PATH"
exit 1
fi
# Check if requirements are installed
echo "Checking dependencies..."
if ! python3 -c "import flask, openai, pandas, qdrant_client" 2>/dev/null; then
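echo "Installing dependencies..."
# Stop here if installation fails instead of starting a server that cannot import its dependencies
python3 -m pip install -r requirements.txt || { echo "Error: Failed to install dependencies"; exit 1; }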
fi
# Start the API server
echo "Starting API server on http://localhost:5000"
echo "Press Ctrl+C to stop the server"
python3 api.py
# This line will only execute if the server stops
echo "Server stopped."
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Customer Support Q&A</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<style>
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background-color: #f8f9fa;
padding: 20px;
}
.container {
max-width: 800px;
margin: 0 auto;
}
.question-form {
background-color: white;
padding: 30px;
border-radius: 10px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
}
.answer-container {
background-color: white;
padding: 30px;
border-radius: 10px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
display: none;
}
.source-results {
margin-top: 30px;
}
.result-card {
margin-bottom: 15px;
border-left: 4px solid #0d6efd;
}
.loading {
display: none;
text-align: center;
margin: 20px 0;
}
.spinner-border {
width: 3rem;
height: 3rem;
}
h1 {
color: #0d6efd;
margin-bottom: 25px;
}
.answer-text {
padding: 15px;
background-color: #f8f9fa;
border-radius: 5px;
border-left: 4px solid #198754;
}
.error-message {
color: #dc3545;
margin-top: 15px;
display: none;
}
</style>
</head>
<body>
<div class="container">
<h1 class="text-center">Customer Support Q&A</h1>
<div class="question-form">
<form id="questionForm">
<div class="mb-3">
<label for="questionInput" class="form-label">Ask a question:</label>
<input type="text" class="form-control form-control-lg" id="questionInput"
placeholder="How do I reset my password?">
</div>
<button type="submit" class="btn btn-primary btn-lg">Get Answer</button>
</form>
<div class="error-message" id="errorMessage"></div>
<div class="loading" id="loadingSpinner">
<div class="spinner-border text-primary" role="status">
<span class="visually-hidden">Loading...</span>
</div>
<p class="mt-2">Searching for answers...</p>
</div>
</div>
<div class="answer-container" id="answerContainer">
<h2>Answer:</h2>
<div class="answer-text" id="answerText"></div>
<div class="source-results" id="sourceResults">
<h3>Source Conversations:</h3>
<div id="resultsList"></div>
</div>
</div>
</div>
<script>
document.addEventListener('DOMContentLoaded', function() {
const questionForm = document.getElementById('questionForm');
const questionInput = document.getElementById('questionInput');
const loadingSpinner = document.getElementById('loadingSpinner');
const answerContainer = document.getElementById('answerContainer');
const answerText = document.getElementById('answerText');
const resultsList = document.getElementById('resultsList');
const errorMessage = document.getElementById('errorMessage');
questionForm.addEventListener('submit', function(e) {
e.preventDefault();
const question = questionInput.value.trim();
if (!question) {
showError('Please enter a question.');
return;
}
// Reset UI
errorMessage.style.display = 'none';
answerContainer.style.display = 'none';
loadingSpinner.style.display = 'block';
// Send API request
fetch('/api/ask', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({ question: question })
})
.then(response => {
if (!response.ok) {
throw new Error(`Request failed with status ${response.status}`);
}
return response.json();
})
.then(data => {
loadingSpinner.style.display = 'none';
// Display answer as plain text so markup in the response is not interpreted as HTML
answerText.textContent = data.answer;
// Display source results
resultsList.innerHTML = '';
if (data.results && data.results.length > 0) {
data.results.forEach((result, index) => {
const resultCard = document.createElement('div');
resultCard.className = 'card result-card';
const cardBody = document.createElement('div');
cardBody.className = 'card-body';
const relevanceText = document.createElement('small');
relevanceText.className = 'text-muted d-block mb-2';
relevanceText.textContent = `Relevance: ${(result.score * 100).toFixed(2)}%`;
const questionText = document.createElement('h5');
questionText.className = 'card-title';
questionText.textContent = result.question;
// Named resultAnswer to avoid shadowing the outer answerText element
const resultAnswer = document.createElement('p');
resultAnswer.className = 'card-text';
resultAnswer.textContent = result.answer;
const previewTitle = document.createElement('p');
previewTitle.className = 'card-text mt-2 mb-1 fw-bold';
previewTitle.textContent = 'Conversation preview:';
const previewText = document.createElement('p');
previewText.className = 'card-text small text-muted';
previewText.textContent = result.preview;
cardBody.appendChild(relevanceText);
cardBody.appendChild(questionText);
cardBody.appendChild(resultAnswer);
cardBody.appendChild(previewTitle);
cardBody.appendChild(previewText);
resultCard.appendChild(cardBody);
resultsList.appendChild(resultCard);
});
} else {
resultsList.innerHTML = '<p>No source conversations found.</p>';
}
answerContainer.style.display = 'block';
})
.catch(error => {
loadingSpinner.style.display = 'none';
showError('Error fetching answer: ' + error.message);
console.error('Error:', error);
});
});
function showError(message) {
errorMessage.textContent = message;
errorMessage.style.display = 'block';
}
});
</script>
</body>
</html>
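
The `.then(data => ...)` handler in the template reads `data.answer` and, for each entry of `data.results`, the fields `score`, `question`, `answer`, and `preview`. A minimal sketch of normalizing that payload before rendering follows, assuming missing or malformed fields should fall back to empty values. The helper name `normalizeAskResponse` and the fallback behavior are illustrative and not taken from the original code.

```javascript
// Hypothetical helper: coerces the /api/ask response into the shape the
// rendering code expects, so a missing field cannot break the result cards.
function normalizeAskResponse(data) {
  const payload = data && typeof data === 'object' ? data : {};
  const rawResults = Array.isArray(payload.results) ? payload.results : [];
  return {
    answer: typeof payload.answer === 'string' ? payload.answer : '',
    results: rawResults
      // Drop entries that are not objects (e.g. null)
      .filter((result) => result && typeof result === 'object')
      .map((result) => {
        const score = Number(result.score);
        return {
          // Same formatting as the page: score as a percentage, two decimals.
          // Assumption: a non-numeric score is shown as 0.
          relevance: `${((Number.isFinite(score) ? score : 0) * 100).toFixed(2)}%`,
          question: String(result.question ?? ''),
          answer: String(result.answer ?? ''),
          preview: String(result.preview ?? ''),
        };
      }),
  };
}

console.log(JSON.stringify(normalizeAskResponse({
  answer: 'Use the reset link.',
  results: [{ score: 0.5, question: 'How do I reset my password?' }],
})));
```

The rendering loop could then iterate over the normalized `results` and assign each field with `textContent`, as the existing card-building code already does.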