SOL CoD Phase 4: Gemini 2.5 API Integration Architecture
📋 Document Overview
Date: 2025-09-03
Version: 1.0.0
Owner: sol_cod_architect
Approval: Pending
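🎯 Goals
- Integrate the Gemini 2.5 API with the four expert agents of the SOL CoD system
- Design an LLM Service Layer that follows Clean Architecture principles
- Integrate the Gemini API seamlessly with the LangGraph.js StateGraph
- Meet production security and performance requirements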
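🏗️ Architecture Overview
1. System Architecture Structure
2. Gemini API Integration Architecture
Core design principles:
- Hexagonal Architecture: integrate the Gemini API through the Port & Adapter pattern
- Repository Pattern: abstract LLM calls behind a repository
- Strategy Pattern: allow different LLM providers to be swapped in
- Circuit Breaker: recover automatically from API failures
- Rate Limiting: control cost and stay within API quotas
🔧 Core Component Design
1. LLM Service Layer Architecture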
// Domain Layer - LLM Service Interface
interface LLMServicePort {
generateResponse(prompt: string, options: LLMOptions): Promise<LLMResponse>;
validateResponse(response: LLMResponse): Promise<boolean>;
getTokenUsage(): TokenUsage;
isHealthy(): Promise<boolean>;
}
// Infrastructure Layer - Gemini Service Implementation
@Injectable()
export class GeminiLLMService implements LLMServicePort {
constructor(
private readonly geminiClient: GeminiClient,
private readonly tokenManager: TokenManagerService,
private readonly rateLimiter: RateLimiterService,
private readonly circuitBreaker: CircuitBreakerService
) {}
}
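The class above shows only the constructor. As a rough illustration of how `generateResponse` might compose the rate limiter, the circuit breaker, and the client, here is a minimal sketch. Every name in it (`SketchClient`, `SketchRateLimiter`, `SketchBreaker`, `SketchLLMService`, `makeLimiter`) is a hypothetical stand-in, not the real service or the Gemini SDK, and the fakes exist only so the flow can run without network access.

```typescript
// Hypothetical, simplified collaborator shapes (not the real services or SDK).
interface SketchClient { generate(prompt: string): Promise<string>; }
interface SketchRateLimiter { tryAcquire(): boolean; }
interface SketchBreaker { run<T>(op: () => Promise<T>): Promise<T>; }

class SketchLLMService {
  private readonly client: SketchClient;
  private readonly limiter: SketchRateLimiter;
  private readonly breaker: SketchBreaker;

  constructor(client: SketchClient, limiter: SketchRateLimiter, breaker: SketchBreaker) {
    this.client = client;
    this.limiter = limiter;
    this.breaker = breaker;
  }

  async generateResponse(prompt: string): Promise<string> {
    // 1. Refuse early when the quota is exhausted, before spending tokens.
    if (!this.limiter.tryAcquire()) {
      throw new Error('Rate limit exceeded');
    }
    // 2. Route the provider call through the breaker so repeated failures can trip it.
    return this.breaker.run(() => this.client.generate(prompt));
  }
}

// In-memory fakes used only to exercise the flow.
const fakeClient: SketchClient = { generate: async (p) => `echo:${p}` };
const passThroughBreaker: SketchBreaker = { run: (op) => op() };
function makeLimiter(max: number): SketchRateLimiter {
  let used = 0;
  return { tryAcquire: () => ++used <= max };
}
```

Checking the rate limit before entering the breaker keeps quota rejections from being counted as provider failures.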
2. Expert Agent + Gemini Integration Pattern
// Integrate the LLM into BaseExpertAgent
export abstract class BaseExpertAgent {
constructor(
// existing dependencies...
private readonly llmService: LLMServicePort,
private readonly promptManager: PromptManagerService
) {}
protected async performAnalysis(preprocessedData: any): Promise<ExpertAnalysis> {
// 1. Build the expert-specific prompt
const prompt = await this.generateAnalysisPrompt(preprocessedData);
// 2. Call the Gemini API (through the Circuit Breaker and Rate Limiter)
const llmResponse = await this.llmService.generateResponse(prompt, {
model: 'gemini-2.5-pro',
maxTokens: 2048,
temperature: 0.1,
expertType: this.agentType
});
// 3. Structure and validate the response
return await this.parseAndValidateResponse(llmResponse);
}
}
3. LangGraph StateGraph + Gemini Integration
// Gemini API integration inside the StateGraph
export class SOLCoDStateGraphService {
private buildStateGraph(): StateGraph {
return new StateGraphBuilder()
.addNode('analysis_phase', async (state) => {
// The four expert agents call the Gemini API in parallel
const analyses = await Promise.all(
this.expertAgents.map(agent =>
agent.executeAsNode(state) // uses the Gemini API internally
)
);
return { ...state, currentAnalyses: analyses };
})
.addNode('debate_phase', async (state) => {
// The debate between expert agents is also driven by the Gemini API
const debates = await this.conductDebateRound(state);
return { ...state, debateHistory: debates };
})
// ... other nodes
.build();
}
}
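`Promise.all` rejects as soon as one promise rejects, so in the analysis node a single failed Gemini call would fail the whole phase. One possible alternative, sketched below, is `Promise.allSettled` with a minimum number of successful experts. The names `SketchAnalysis`, `runExpertsTolerantly`, and `minSuccesses` are hypothetical; the real `ExpertAnalysis` type and node signature are not shown in this document.

```typescript
// Hypothetical result shape; the real ExpertAnalysis type is not shown here.
interface SketchAnalysis { agent: string; solScore: number; }

async function runExpertsTolerantly(
  experts: Array<() => Promise<SketchAnalysis>>,
  minSuccesses: number,
): Promise<SketchAnalysis[]> {
  // allSettled waits for every expert and never rejects on its own.
  const settled = await Promise.allSettled(experts.map((run) => run()));
  const analyses: SketchAnalysis[] = [];
  for (const outcome of settled) {
    if (outcome.status === 'fulfilled') analyses.push(outcome.value);
  }
  // Fail the node only when too few experts answered to hold a debate.
  if (analyses.length < minSuccesses) {
    throw new Error(`Only ${analyses.length} expert(s) succeeded`);
  }
  return analyses;
}
```

Whether a debate with three of four experts is acceptable is a product decision; the threshold is left as a parameter for that reason.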
🔐 Security and Key Management
1. API Key Security
// Per-environment key management strategy
interface GeminiConfig {
apiKey: string; // loaded from Secret Manager
projectId: string; // GCP Project ID
region: string; // 'europe-west3' (for GDPR compliance)
endpoint?: string; // Custom endpoint if needed
retryConfig: RetryConfig;
circuitBreakerConfig: CircuitBreakerConfig;
}
@Injectable()
export class GeminiConfigService {
private readonly config: GeminiConfig;
constructor(private readonly secretManager: SecretManagerService) {
this.config = {
apiKey: this.secretManager.getSecret('GEMINI_API_KEY'),
projectId: process.env.GCP_PROJECT_ID,
region: 'europe-west3',
retryConfig: { maxRetries: 3, backoffMs: 1000 },
circuitBreakerConfig: { threshold: 5, timeout: 30000 }
};
}
}
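The constructor above reads `process.env.GCP_PROJECT_ID` without checking that it is set. A minimal sketch of fail-fast validation at startup follows. `SketchGeminiConfig` and `loadGeminiConfig` are hypothetical names, and `GEMINI_REGION` is an assumed environment variable that does not appear in the original design.

```typescript
// Hypothetical subset of GeminiConfig, used only for this sketch.
interface SketchGeminiConfig { apiKey: string; projectId: string; region: string; }

function loadGeminiConfig(env: Record<string, string | undefined>): SketchGeminiConfig {
  const required = ['GEMINI_API_KEY', 'GCP_PROJECT_ID'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Fail at startup rather than on the first Gemini call.
    throw new Error(`Missing configuration: ${missing.join(', ')}`);
  }
  return {
    apiKey: env['GEMINI_API_KEY'] as string,
    projectId: env['GCP_PROJECT_ID'] as string,
    // GEMINI_REGION is an assumed variable name; the default matches the document.
    region: env['GEMINI_REGION'] ?? 'europe-west3',
  };
}
```

In the service itself the `env` argument would be `process.env`, with the API key coming from the secret manager instead of a plain variable.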
2. API Key Rotation and Monitoring
// Monitor API key health
@Injectable()
export class GeminiMonitoringService {
async checkApiKeyHealth(): Promise<HealthCheckResult> {
try {
// Verify the key with a minimal API call
await this.geminiService.generateResponse('Health check', {
maxTokens: 10,
model: 'gemini-2.5-pro'
});
return { healthy: true, lastChecked: new Date() };
} catch (error) {
return {
healthy: false,
error: error instanceof Error ? error.message : String(error),
lastChecked: new Date()
};
}
}
}
💰 Token Cost Optimization Strategy
1. Token Usage Tracking and Management
@Injectable()
export class TokenManagerService {
private readonly dailyBudget = 10000; // daily token budget
private readonly userTokenLimits = new Map<string, number>();
async trackTokenUsage(
userId: string,
agentType: ExpertAgentType,
promptTokens: number,
completionTokens: number
): Promise<void> {
const totalTokens = promptTokens + completionTokens;
// Track token usage per user
await this.redisClient.hincrby(`tokens:${userId}:daily`, agentType, totalTokens);
// Track system-wide token usage
await this.redisClient.hincrby('tokens:system:daily', 'total', totalTokens);
// Compute the cost (based on Gemini 2.5 Pro)
const cost = this.calculateCost(promptTokens, completionTokens);
// The cost is a fractional dollar amount, so use HINCRBYFLOAT; HINCRBY accepts integers only
await this.redisClient.hincrbyfloat('costs:daily', 'gemini', cost);
}
private calculateCost(promptTokens: number, completionTokens: number): number {
// Gemini 2.5 Pro pricing (estimated)
const PROMPT_COST_PER_1K = 0.001; // $0.001 per 1K tokens
const COMPLETION_COST_PER_1K = 0.002; // $0.002 per 1K tokens
return (promptTokens / 1000) * PROMPT_COST_PER_1K +
(completionTokens / 1000) * COMPLETION_COST_PER_1K;
}
}
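To make the cost formula concrete, the following worked example applies it to one analysis round of four expert calls. The prices are the placeholder estimates from `calculateCost` above, not official figures, and the token counts borrow the `tokenEstimate` (150) and `maxOutputTokens` (300) of the sleep analysis template as a stand-in for every expert. `estimateCostUsd` and `EXPERT_COUNT` are names introduced only for this sketch.

```typescript
// Placeholder prices copied from the estimate above; they are not official figures.
const PROMPT_COST_PER_1K = 0.001;
const COMPLETION_COST_PER_1K = 0.002;

function estimateCostUsd(promptTokens: number, completionTokens: number): number {
  return (promptTokens / 1000) * PROMPT_COST_PER_1K +
    (completionTokens / 1000) * COMPLETION_COST_PER_1K;
}

// One analysis round = one call per expert, with 150 input and 300 output tokens each.
const EXPERT_COUNT = 4;
const perCallUsd = estimateCostUsd(150, 300);
const perRoundUsd = perCallUsd * EXPERT_COUNT;
const tokensPerRound = (150 + 300) * EXPERT_COUNT; // 1800 tokens

console.log(perCallUsd.toFixed(5), perRoundUsd.toFixed(4), tokensPerRound);
```

Under these assumptions a single round uses 1,800 tokens, so the 10,000-token `dailyBudget` in `TokenManagerService` would cover only five rounds per day before any debate rounds are counted. That figure may be worth revisiting against the per-environment limits.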
2. Prompt Optimization System
// Optimized prompt templates for each expert
export class ExpertPromptTemplates {
// Optimized prompt for the Sleep Pattern Analyst (tuned for token efficiency)
static readonly SLEEP_ANALYSIS_TEMPLATE: PromptTemplate = {
id: 'sleep_analysis_v2_optimized',
template: `Analyze sleep patterns for SOL prediction:
User Data: {{sleepDataSummary}}
Questionnaire: {{questionnaireSummary}}
Provide structured response:
1. SOL Score (0-120 min): [number]
2. Confidence (0.0-1.0): [number]
3. Key Factors: [max 3 bullet points]
4. Reasoning: [max 100 words]
Focus on sleep latency indicators only.`,
tokenEstimate: 150, // estimated input tokens
maxOutputTokens: 300,
version: '2.0'
};
// Optimized prompt for the CBT-I Behavior Expert
static readonly CBTI_ANALYSIS_TEMPLATE: PromptTemplate = {
id: 'cbti_analysis_v2_optimized',
template: `CBT-I behavioral analysis for SOL prediction:
Sleep Behaviors: {{behaviorDataSummary}}
Previous SOL: {{previousSOL}}
Response format:
1. SOL Score: [number]
2. Confidence: [number]
3. CBT-I Factors: [max 3]
4. Behavioral Issues: [max 2]
5. Recommendations: [max 2]
Be concise, focus on behavioral patterns.`,
tokenEstimate: 120,
maxOutputTokens: 250,
version: '2.0'
};
}
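The templates above use `{{name}}` placeholders, but the substitution step is not shown. A minimal sketch of one way to render them is given below. `renderTemplate` is a hypothetical helper, and rejecting unknown placeholders is a design assumption rather than documented behavior of `PromptManagerService`.

```typescript
// Replace every {{name}} placeholder. A missing value raises an error so that
// a half-filled prompt is never sent to the model.
function renderTemplate(template: string, values: Record<string, string | number>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    if (!(name in values)) {
      throw new Error(`Missing template value: ${name}`);
    }
    return String(values[name]);
  });
}

// Example values are invented for illustration.
const rendered = renderTemplate(
  'Sleep Behaviors: {{behaviorDataSummary}}\nPrevious SOL: {{previousSOL}}',
  { behaviorDataSummary: 'late caffeine, irregular bedtime', previousSOL: 35 },
);
console.log(rendered);
```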
🚦 Rate Limiting and Error Handling
1. Layered Rate Limiting Strategy
@Injectable()
export class RateLimiterService {
private readonly limits = {
perUser: { requests: 100, windowMs: 3600000 }, // 100 requests per hour per user
perAgent: { requests: 50, windowMs: 3600000 }, // 50 requests per hour per agent
system: { requests: 1000, windowMs: 3600000 } // 1,000 requests per hour system-wide
};
async checkRateLimit(
userId: string,
agentType: ExpertAgentType
): Promise<RateLimitResult> {
const userKey = `rate_limit:user:${userId}`;
const agentKey = `rate_limit:agent:${agentType}`;
const systemKey = 'rate_limit:system';
const [userCount, agentCount, systemCount] = await Promise.all([
this.redisClient.incr(userKey),
this.redisClient.incr(agentKey),
this.redisClient.incr(systemKey)
]);
// Set the TTL on the first request of each window
if (userCount === 1) await this.redisClient.expire(userKey, 3600);
if (agentCount === 1) await this.redisClient.expire(agentKey, 3600);
if (systemCount === 1) await this.redisClient.expire(systemKey, 3600);
return {
allowed: userCount <= this.limits.perUser.requests &&
agentCount <= this.limits.perAgent.requests &&
systemCount <= this.limits.system.requests,
remainingRequests: Math.max(0, Math.min(
this.limits.perUser.requests - userCount,
this.limits.perAgent.requests - agentCount,
this.limits.system.requests - systemCount
))
};
}
}
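The Redis `INCR` plus `EXPIRE` combination above implements a fixed-window counter. The in-memory sketch below mirrors those semantics with an injectable clock so the window behavior can be checked without Redis. `FixedWindowCounter` is a hypothetical class for illustration only and is not meant to replace the Redis-backed service.

```typescript
class FixedWindowCounter {
  private readonly windows = new Map<string, { count: number; expiresAt: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Mirrors INCR + EXPIRE: the first hit opens a window, later hits count against it.
  hit(key: string): { allowed: boolean; remaining: number } {
    const t = this.now();
    let w = this.windows.get(key);
    if (!w || t >= w.expiresAt) {
      w = { count: 0, expiresAt: t + this.windowMs };
      this.windows.set(key, w);
    }
    w.count += 1;
    return { allowed: w.count <= this.limit, remaining: Math.max(0, this.limit - w.count) };
  }
}
```

A fixed window can admit up to twice the limit across a window boundary; a sliding window or token bucket avoids that at the cost of more state.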
2. Circuit Breaker Pattern Implementation
@Injectable()
export class CircuitBreakerService {
private readonly breakers = new Map<string, CircuitBreaker>();
async executeWithCircuitBreaker<T>(
serviceName: string,
operation: () => Promise<T>
): Promise<T> {
const breaker = this.getOrCreateBreaker(serviceName);
if (breaker.state === 'OPEN') {
const timeSinceLastFailure = Date.now() - breaker.lastFailureTime;
if (timeSinceLastFailure < breaker.timeout) {
throw new ServiceUnavailableException(
`Circuit breaker is OPEN for ${serviceName}`
);
} else {
breaker.state = 'HALF_OPEN';
}
}
try {
const result = await operation();
// Any success closes the breaker and clears the count, so the threshold applies to consecutive failures
breaker.state = 'CLOSED';
breaker.failureCount = 0;
return result;
} catch (error) {
breaker.failureCount++;
breaker.lastFailureTime = Date.now();
if (breaker.failureCount >= breaker.threshold) {
breaker.state = 'OPEN';
}
throw error;
}
}
}
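`executeWithCircuitBreaker` calls `getOrCreateBreaker`, which is not defined in the snippet. A minimal sketch of that helper and of the OPEN to HALF_OPEN timeout check is shown below. The record shape `SketchBreakerRecord` and the function `isCallPermitted` are assumptions; the defaults reuse the `threshold: 5` and `timeout: 30000` values from `circuitBreakerConfig`.

```typescript
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

// Hypothetical record shape inferred from the fields used in the service above.
interface SketchBreakerRecord {
  state: BreakerState;
  failureCount: number;
  lastFailureTime: number;
  threshold: number;
  timeout: number;
}

const breakers = new Map<string, SketchBreakerRecord>();

function getOrCreateBreaker(serviceName: string, threshold = 5, timeout = 30000): SketchBreakerRecord {
  let breaker = breakers.get(serviceName);
  if (!breaker) {
    breaker = { state: 'CLOSED', failureCount: 0, lastFailureTime: 0, threshold, timeout };
    breakers.set(serviceName, breaker);
  }
  return breaker;
}

// True when a call may proceed; moves OPEN to HALF_OPEN once the timeout has elapsed.
function isCallPermitted(breaker: SketchBreakerRecord, now: number): boolean {
  if (breaker.state !== 'OPEN') return true;
  if (now - breaker.lastFailureTime < breaker.timeout) return false;
  breaker.state = 'HALF_OPEN';
  return true;
}
```

Passing `now` explicitly keeps the timeout logic deterministic in tests.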
📊 Prompt Optimization and Performance Tuning
1. Prompt Strategy for the Four Experts
// 1. Sleep Pattern Analyst: expert in sleep pattern analysis
export class SleepPatternAnalystAgent extends BaseExpertAgent {
protected initializePromptTemplates(): void {
this.registerPromptTemplate({
id: 'sleep_initial_analysis',
template: `As a sleep pattern specialist, analyze the following data for SOL prediction:
**Sleep Metrics (Last 7 days)**:
- Average bedtime: {{avgBedtime}}
- Average sleep onset: {{avgSleepOnset}}
- Sleep efficiency: {{sleepEfficiency}}%
- Wake episodes: {{avgWakeEpisodes}}
**Current Factors**:
{{sleepDataSummary}}
**Task**: Predict tonight's Sleep Onset Latency (SOL) in minutes.
**Response Format** (JSON):
{
"solScore": number (0-120),
"confidenceScore": number (0.0-1.0),
"keyFactors": [string, string, string],
"analysis": "Brief explanation focusing on sleep patterns (max 50 words)",
"riskFactors": [string, string]
}
Be precise and data-driven.`,
tokenEstimate: 200,
maxOutputTokens: 300,
version: '2.1'
});
}
}
// 2. Psychological State Analyst: expert in psychological state analysis
export class PsychologicalStateAnalystAgent extends BaseExpertAgent {
protected initializePromptTemplates(): void {
this.registerPromptTemplate({
id: 'psychological_analysis',
template: `As a sleep psychology expert, assess psychological factors affecting SOL:
**Mental Health Indicators**:
- PHQ-9 Score: {{phq9Score}} (Depression screening)
- GAD-7 Score: {{gad7Score}} (Anxiety screening)
- Stress Level (1-10): {{stressLevel}}
- Sleep Anxiety: {{sleepAnxiety}}
**Behavioral Patterns**:
{{psychologicalDataSummary}}
**Previous SOL**: {{previousSOL}} minutes
Predict SOL considering psychological state:
**Response Format** (JSON):
{
"solScore": number,
"confidenceScore": number,
"keyFactors": ["psychological factor 1", "factor 2", "factor 3"],
"analysis": "Psychology-focused explanation (max 50 words)",
"interventions": ["suggestion 1", "suggestion 2"]
}
Focus on mind-sleep connection.`,
tokenEstimate: 180,
maxOutputTokens: 280,
version: '2.1'
});
}
}
// 3. CBT-I Sleep Behavior Expert: expert in cognitive behavioral therapy
export class CBTISleepBehaviorExpertAgent extends BaseExpertAgent {
protected initializePromptTemplates(): void {
this.registerPromptTemplate({
id: 'cbti_behavior_analysis',
template: `As a CBT-I specialist, evaluate sleep behaviors impacting SOL:
**Sleep Hygiene Behaviors**:
- Pre-sleep routine: {{preSleepRoutine}}
- Screen time before bed: {{screenTime}} minutes
- Caffeine intake: {{caffeineIntake}}
- Exercise timing: {{exerciseTiming}}
**CBT-I Compliance**:
- Sleep restriction adherence: {{sleepRestriction}}%
- Stimulus control: {{stimulusControl}}%
- Sleep diary completion: {{sleepDiaryRate}}%
**Problematic Behaviors**: {{behavioralIssues}}
Predict SOL from CBT-I perspective:
**Response Format** (JSON):
{
"solScore": number,
"confidenceScore": number,
"keyFactors": ["behavior 1", "behavior 2", "behavior 3"],
"analysis": "CBT-I focused explanation (max 50 words)",
"behavioralTargets": ["target 1", "target 2"],
"complianceScore": number (0.0-1.0)
}
Emphasize behavioral modifications.`,
tokenEstimate: 220,
maxOutputTokens: 320,
version: '2.1'
});
}
}
// 4. Digital Sleep Environment Expert: expert in the digital environment
export class DigitalSleepEnvironmentExpertAgent extends BaseExpertAgent {
protected initializePromptTemplates(): void {
this.registerPromptTemplate({
id: 'digital_environment_analysis',
template: `As a digital sleep environment specialist, analyze tech impact on SOL:
**Sleep Environment**:
- Room temperature: {{roomTemp}}°C
- Light exposure: {{lightLevel}} lux
- Noise level: {{noiseLevel}} dB
- Air quality index: {{airQuality}}
**Digital Factors**:
- Device usage pattern: {{deviceUsage}}
- Blue light exposure: {{blueLightHours}} hours/day
- Sleep app engagement: {{appEngagement}}%
- Smart device interference: {{smartDevices}}
**Environmental Data**: {{environmentalSummary}}
Predict SOL considering digital environment:
**Response Format** (JSON):
{
"solScore": number,
"confidenceScore": number,
"keyFactors": ["env factor 1", "factor 2", "factor 3"],
"analysis": "Environment-focused explanation (max 50 words)",
"environmentalRisks": ["risk 1", "risk 2"],
"optimizationScore": number (0.0-1.0)
}
Focus on environmental optimization.`,
tokenEstimate: 200,
maxOutputTokens: 300,
version: '2.1'
});
}
}
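All four prompts request a JSON object with `solScore`, `confidenceScore`, and `keyFactors`, and `parseAndValidateResponse` is expected to check it. The following sketch shows one possible validation of those shared fields. `ParsedExpertResponse` and `parseExpertResponse` are hypothetical names, and the 0 to 120 range for `solScore` is taken from the sleep pattern prompt and assumed to apply to every expert.

```typescript
interface ParsedExpertResponse {
  solScore: number;
  confidenceScore: number;
  keyFactors: string[];
}

function parseExpertResponse(raw: string): ParsedExpertResponse {
  // Models sometimes wrap the JSON in extra text; keep only the outermost {...}.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) throw new Error('No JSON object found in response');

  const data: unknown = JSON.parse(raw.slice(start, end + 1));
  if (typeof data !== 'object' || data === null) throw new Error('Response is not an object');
  const { solScore, confidenceScore, keyFactors } = data as Record<string, unknown>;

  if (typeof solScore !== 'number' || solScore < 0 || solScore > 120) {
    throw new Error('solScore must be a number between 0 and 120');
  }
  if (typeof confidenceScore !== 'number' || confidenceScore < 0 || confidenceScore > 1) {
    throw new Error('confidenceScore must be a number between 0.0 and 1.0');
  }
  if (!Array.isArray(keyFactors) || !keyFactors.every((f) => typeof f === 'string')) {
    throw new Error('keyFactors must be an array of strings');
  }
  return { solScore, confidenceScore, keyFactors: keyFactors as string[] };
}
```

Expert-specific fields such as `complianceScore` or `optimizationScore` would need their own checks on top of this shared core.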
2. Chain of Debate Optimization Prompts
// Optimized prompts used during the debate phase
export class DebatePromptTemplates {
static readonly DEBATE_ROUND_TEMPLATE = `**Chain of Debate - Round {{roundNumber}}**
**My Analysis**: SOL {{mySOL}}min (confidence: {{myConfidence}})
**Other Expert Opinions**:
{{otherAnalyses}}
**Disagreement Points**:
{{disagreements}}
As {{agentType}}, defend or revise your prediction:
**Response Format** (JSON):
{
"revisedSOL": number (if changed),
"revisedConfidence": number (if changed),
"counterArguments": ["argument 1", "argument 2"],
"supportingEvidence": ["evidence 1", "evidence 2"],
"finalStance": "maintain/revise",
"reasoning": "Brief explanation (max 40 words)"
}
Be concise but convincing.`;
static readonly CONSENSUS_BUILDING_TEMPLATE = `**Consensus Building Phase**
**Expert Predictions**:
{{expertPredictions}}
**Debate History**:
{{debateHistory}}
**Convergence Analysis**:
- Standard deviation: {{standardDeviation}}
- Agreement level: {{agreementLevel}}%
Build final consensus:
**Response Format** (JSON):
{
"finalSOLScore": number,
"consensusConfidence": number,
"consensusFactors": ["factor 1", "factor 2", "factor 3"],
"expertAgreement": number (0.0-1.0),
"reasoning": "Consensus explanation (max 60 words)",
"uncertaintyAreas": ["area 1", "area 2"]
}`;
}
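The consensus template takes `{{standardDeviation}}` and `{{agreementLevel}}` as inputs, but the document does not define how they are computed. The sketch below shows one plausible definition: the population standard deviation of the expert SOL predictions, and agreement as the share of experts within a tolerance of the mean. Both definitions, the `toleranceMin` parameter, and the sample predictions are assumptions made for illustration.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Population standard deviation of the expert predictions (in minutes).
function standardDeviation(xs: number[]): number {
  const m = mean(xs);
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))));
}

// ASSUMPTION: agreement is the percentage of experts within toleranceMin of the mean.
function agreementLevel(xs: number[], toleranceMin: number): number {
  const m = mean(xs);
  const within = xs.filter((x) => Math.abs(x - m) <= toleranceMin).length;
  return (within / xs.length) * 100;
}

// Hypothetical SOL predictions from the four experts, in minutes.
const predictions = [20, 25, 30, 45];
const spread = standardDeviation(predictions);
const agreement = agreementLevel(predictions, 10);
console.log(spread.toFixed(2), agreement);
```

For these sample values the mean is 30 minutes and three of the four experts fall within 10 minutes of it, so the agreement level is 75%.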
🔍 Performance Monitoring and Optimization
1. Gemini API Performance Monitoring
@Injectable()
export class GeminiPerformanceMonitor {
private readonly metrics = {
totalRequests: 0,
totalTokens: 0,
totalCost: 0,
averageLatency: 0,
errorRate: 0,
expertAgentUsage: new Map<ExpertAgentType, number>()
};
async recordAPICall(
agentType: ExpertAgentType,
promptTokens: number,
completionTokens: number,
latencyMs: number,
success: boolean
): Promise<void> {
this.metrics.totalRequests++;
this.metrics.totalTokens += promptTokens + completionTokens;
// Track usage per expert agent
const currentUsage = this.metrics.expertAgentUsage.get(agentType) || 0;
this.metrics.expertAgentUsage.set(agentType, currentUsage + 1);
// Update the average latency
this.metrics.averageLatency =
(this.metrics.averageLatency * (this.metrics.totalRequests - 1) + latencyMs) /
this.metrics.totalRequests;
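// Track cost
this.metrics.totalCost += this.calculateCost(promptTokens, completionTokens);
// Update the error rate as a running mean over all requests (failure = 1, success = 0)
this.metrics.errorRate =
(this.metrics.errorRate * (this.metrics.totalRequests - 1) + (success ? 0 : 1)) /
this.metrics.totalRequests;
// Persist the metrics to Redis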
await this.saveMetricsToRedis();
}
async getPerformanceReport(): Promise<PerformanceReport> {
return {
...this.metrics,
costPerPrediction: this.metrics.totalRequests > 0 ? this.metrics.totalCost / this.metrics.totalRequests : 0,
tokensPerPrediction: this.metrics.totalRequests > 0 ? this.metrics.totalTokens / this.metrics.totalRequests : 0,
expertEfficiency: Array.from(this.metrics.expertAgentUsage.entries())
.map(([agent, usage]) => ({ agent, usage, efficiency: usage / this.metrics.totalRequests }))
};
}
}
2. Prompt A/B Testing System
@Injectable()
export class PromptOptimizationService {
private readonly activeTests = new Map<string, PromptABTest>();
async createABTest(
agentType: ExpertAgentType,
promptVariants: PromptTemplate[]
): Promise<string> {
const testId = `ab_test_${agentType}_${Date.now()}`;
this.activeTests.set(testId, {
id: testId,
agentType,
variants: promptVariants,
results: new Map(),
startTime: Date.now(),
sampleSize: 100 // analyze after 100 predictions
});
return testId;
}
async selectPromptVariant(testId: string): Promise<PromptTemplate> {
const test = this.activeTests.get(testId);
if (!test) throw new Error(`AB test not found: ${testId}`);
// Uniform random selection (round-robin is an alternative)
const variantIndex = Math.floor(Math.random() * test.variants.length);
return test.variants[variantIndex];
}
async recordTestResult(
testId: string,
variantId: string,
accuracy: number,
tokenUsage: number,
latency: number
): Promise<void> {
const test = this.activeTests.get(testId);
if (!test) return;
const results = test.results.get(variantId) || [];
results.push({ accuracy, tokenUsage, latency, timestamp: Date.now() });
test.results.set(variantId, results);
// Run the statistical analysis once enough samples have been collected
if (results.length >= test.sampleSize) {
await this.analyzeABTestResults(testId);
}
}
}
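`recordTestResult` hands off to `analyzeABTestResults`, which is not shown. As a starting point, the sketch below summarizes each variant by mean accuracy and mean token usage and ranks them. It is a plain comparison of means, not a significance test, and the names `SketchTestResult`, `VariantSummary`, and `summarizeVariants` are hypothetical.

```typescript
interface SketchTestResult { accuracy: number; tokenUsage: number; latency: number; }
interface VariantSummary {
  variantId: string;
  meanAccuracy: number;
  meanTokens: number;
  samples: number;
}

function summarizeVariants(results: Map<string, SketchTestResult[]>): VariantSummary[] {
  const summaries: VariantSummary[] = [];
  results.forEach((samples, variantId) => {
    if (samples.length === 0) return; // skip variants without data
    let accuracySum = 0;
    let tokenSum = 0;
    for (const r of samples) {
      accuracySum += r.accuracy;
      tokenSum += r.tokenUsage;
    }
    summaries.push({
      variantId,
      meanAccuracy: accuracySum / samples.length,
      meanTokens: tokenSum / samples.length,
      samples: samples.length,
    });
  });
  // Highest mean accuracy first; the cheaper prompt wins ties.
  return summaries.sort(
    (a, b) => b.meanAccuracy - a.meanAccuracy || a.meanTokens - b.meanTokens,
  );
}
```

Before promoting a winner, a proper test (for example a two-sample test on accuracy) would be needed to rule out differences caused by noise.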
🌐 Production Deployment and Operations
1. Per-Environment Configuration
// production.config.ts
export const productionGeminiConfig: GeminiConfig = {
apiKey: '${SECRET_MANAGER}', // loaded from Secret Manager
projectId: 'dta-wide-prod',
region: 'europe-west3',
endpoint: 'https://generativelanguage.googleapis.com',
rateLimits: {
requestsPerMinute: 60,
tokensPerMinute: 150000,
dailyBudget: 1000000 // daily cap of 1,000,000 tokens
},
circuitBreaker: {
failureThreshold: 5,
resetTimeout: 30000,
monitoringWindow: 60000
},
retry: {
maxRetries: 3,
backoffStrategy: 'exponential',
initialDelayMs: 1000
}
};
// development.config.ts
export const developmentGeminiConfig: GeminiConfig = {
apiKey: process.env.GEMINI_API_KEY ?? '', // load from the environment; never commit a real key
projectId: 'dta-wide-dev',
region: 'europe-west3',
rateLimits: {
requestsPerMinute: 30,
tokensPerMinute: 50000,
dailyBudget: 100000 // daily cap of 100,000 tokens
}
};
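The production `retry` block specifies `backoffStrategy: 'exponential'` with `initialDelayMs: 1000` and `maxRetries: 3`. A minimal sketch of the resulting delay schedule is shown below. `backoffDelayMs` is a hypothetical helper, and the `maxDelayMs` cap is an assumption that the configuration above does not define.

```typescript
// attempt is 1-based: attempt 1 waits the initial delay, and each later attempt doubles it.
// maxDelayMs is an assumed cap that is not part of the configuration above.
function backoffDelayMs(attempt: number, initialDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, initialDelayMs * Math.pow(2, attempt - 1));
}

// Schedule for maxRetries = 3 and initialDelayMs = 1000.
const delays = [1, 2, 3].map((attempt) => backoffDelayMs(attempt, 1000, 30000));
console.log(delays); // [1000, 2000, 4000]
```

Adding random jitter to each delay is a common refinement that keeps many clients from retrying at the same instant.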
2. Monitoring and Alerting System
@Injectable()
export class GeminiAlertingService {
private readonly alertThresholds = {
errorRate: 0.05, // alert when the error rate exceeds 5%
latency: 10000, // alert when latency exceeds 10 seconds
dailyCost: 100, // alert when the daily cost exceeds $100
tokenUsage: 0.8 // alert at 80% of the daily token limit
};
async checkAlertConditions(): Promise<void> {
const metrics = await this.performanceMonitor.getPerformanceReport();
// Check the error rate
if (metrics.errorRate > this.alertThresholds.errorRate) {
await this.sendAlert('HIGH_ERROR_RATE', {
currentRate: metrics.errorRate,
threshold: this.alertThresholds.errorRate
});
}
// Check the latency
if (metrics.averageLatency > this.alertThresholds.latency) {
await this.sendAlert('HIGH_LATENCY', {
currentLatency: metrics.averageLatency,
threshold: this.alertThresholds.latency
});
}
// Check the cost
if (metrics.totalCost > this.alertThresholds.dailyCost) {
await this.sendAlert('COST_THRESHOLD_EXCEEDED', {
currentCost: metrics.totalCost,
threshold: this.alertThresholds.dailyCost
});
}
}
private async sendAlert(type: string, data: any): Promise<void> {
// Send via Slack, email, or GCP alerting
await this.notificationService.send({
type: 'GEMINI_API_ALERT',
priority: 'HIGH',
message: `Gemini API Alert: ${type}`,
data
});
}
}
📈 Performance Targets and Validation
1. Performance Targets
export const GeminiPerformanceTargets = {
// Response time targets
maxLatencyMs: 8000, // respond within 8 seconds
averageLatencyMs: 3000, // 3-second average response
// Accuracy targets
minAccuracy: 0.80, // at least 80% accuracy
targetAccuracy: 0.85, // target of 85% accuracy
// Availability targets
uptime: 0.999, // 99.9% uptime
maxErrorRate: 0.01, // error rate of 1% or less
// Cost targets
maxCostPerPrediction: 0.05, // at most 5 cents per prediction
dailyBudgetLimit: 500, // at most $500 per day
// Token efficiency
maxTokensPerPrediction: 2000, // at most 2,000 tokens per prediction
targetTokensPerPrediction: 1200 // target of 1,200 tokens
};
2. Performance Validation Tests
@Injectable()
export class GeminiPerformanceValidator {
async validatePerformanceTargets(): Promise<ValidationReport> {
const metrics = await this.performanceMonitor.getPerformanceReport();
// Compute accuracy once so that `actual` and `passed` refer to the same value
const accuracy = await this.calculateAccuracy();
return {
latencyCheck: {
target: GeminiPerformanceTargets.averageLatencyMs,
actual: metrics.averageLatency,
passed: metrics.averageLatency <= GeminiPerformanceTargets.averageLatencyMs
},
accuracyCheck: {
target: GeminiPerformanceTargets.minAccuracy,
actual: accuracy,
passed: accuracy >= GeminiPerformanceTargets.minAccuracy
},
costCheck: {
target: GeminiPerformanceTargets.maxCostPerPrediction,
actual: metrics.costPerPrediction,
passed: metrics.costPerPrediction <= GeminiPerformanceTargets.maxCostPerPrediction
},
tokenEfficiencyCheck: {
target: GeminiPerformanceTargets.maxTokensPerPrediction, // report the same limit that `passed` checks
actual: metrics.tokensPerPrediction,
passed: metrics.tokensPerPrediction <= GeminiPerformanceTargets.maxTokensPerPrediction
}
};
}
}
🔄 Migration and Rollout Strategy
1. Phased Rollout Plan
export class GeminiMigrationPlan {
static readonly ROLLOUT_PHASES = [
{
phase: 1,
name: 'Development Testing',
percentage: 0,
duration: '1 week',
criteria: 'Internal testing with synthetic data'
},
{
phase: 2,
name: 'Canary Release',
percentage: 5,
duration: '1 week',
criteria: '5% of production traffic'
},
{
phase: 3,
name: 'Limited Production',
percentage: 25,
duration: '2 weeks',
criteria: '25% traffic with performance monitoring'
},
{
phase: 4,
name: 'Full Production',
percentage: 100,
duration: 'Ongoing',
criteria: 'All traffic after validation'
}
];
}
@Injectable()
export class GeminiRolloutService {
async getCurrentRolloutPhase(): Promise<number> {
const configValue = await this.configService.get('GEMINI_ROLLOUT_PERCENTAGE');
return parseInt(configValue, 10) || 0;
}
async shouldUseGemini(userId: string): Promise<boolean> {
const rolloutPercentage = await this.getCurrentRolloutPhase();
if (rolloutPercentage === 0) return false; // not enabled yet
if (rolloutPercentage === 100) return true; // enabled for everyone
// Consistent assignment based on the user ID
const userHash = this.hashUserId(userId);
return userHash % 100 < rolloutPercentage;
}
}
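`shouldUseGemini` relies on `hashUserId`, which is not defined here. Any deterministic, non-negative hash works; the sketch below uses a 32-bit FNV-1a variant over UTF-16 code units as one dependency-free option. This is an assumption about the implementation, not the method the service actually uses, and `isInRollout` is a hypothetical wrapper.

```typescript
// FNV-1a (32-bit) over UTF-16 code units: deterministic and dependency-free.
function hashUserId(userId: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned so that `% 100` is never negative
}

// The same user always lands in the same bucket (0-99), so raising the
// percentage only adds users and never removes anyone already enabled.
function isInRollout(userId: string, percentage: number): boolean {
  return hashUserId(userId) % 100 < percentage;
}
```

Because buckets are stable, moving from the 5% canary to the 25% phase keeps every canary user on Gemini.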
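📋 Implementation Checklist
Phase 4.1: Core Infrastructure (1 week)
- Implement GeminiLLMService (Port & Adapter pattern)
- Implement TokenManagerService (cost tracking)
- Implement RateLimiterService (API limits)
- Implement CircuitBreakerService (failure recovery)
- Per-environment configuration management
Phase 4.2: Expert Agent Integration (2 weeks)
- BaseExpertAgent + Gemini integration
- Optimized prompts for each of the four experts
- Chain of Debate prompt system
- Response parsing and validation logic
- LangGraph StateGraph + Gemini integration
Phase 4.3: Performance Optimization (1 week)
- Prompt A/B testing system
- Performance monitoring dashboard
- Notification and alerting system
- Automated cost optimization
- Token usage optimization
Phase 4.4: Production Readiness (1 week)
- Security review and key management
- Phased rollout system
- Performance target validation tests
- Backup and recovery plan
- Operations documentation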
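🎯 Success Criteria
Technical Success Criteria
- Response time: 3 seconds or less on average, 8 seconds at most
- Accuracy: SOL prediction accuracy of at least 80%
- Availability: service uptime of at least 99.9%
- Cost efficiency: 5 cents or less per prediction
Quality Attributes
- Clean Architecture compliance: Dependency Inversion Principle applied throughout (100%)
- Security: secure API key management and encrypted communication
- Scalability: capacity for 10,000 predictions per day
- Monitoring: real-time performance tracking and alerting
Business Success Criteria
- Clinician satisfaction: prediction quality rated 4.5/5.0 or higher
- System stability: at least 99.5% normal service per month
- Cost control: operating costs kept within the monthly budget
- Compliance: 100% compliance with GDPR and medical information protection law
Author: sol_cod_architect
Review requested: @sol_cod_pm
Last updated: 2025-09-03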