Building an Amazon Product Price Prediction Bot (3) - Fine-Tuning GPT-4o

Tags: Hugging Face, Peft, OpenAI

So exciting!

Today is finally the day we fine-tune OpenAI's GPT-4o model (gpt-4o-mini, to be precise) to build the Amazon product price prediction bot.

• Data preprocessing
• Building baseline models
• LLM fine-tuning with GPT
• LLM fine-tuning with LLAMA

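Let's get started!

LLM Fine-Tuning with GPT

Loading Libraries and Setting Up the Environment

First, we load the libraries needed to fine-tune and evaluate the GPT model.

The OpenAI library handles the fine-tuning itself, and Wandb manages the training and evaluation logs.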

# imports

import os
import re
import json
import pickle

import matplotlib.pyplot as plt
from dotenv import load_dotenv
from huggingface_hub import login
from wandb.integration.openai.fine_tuning import WandbLogger
from openai import OpenAI

from items import Item
from testing import Tester


# Constants - used for printing to stdout in color
GREEN = "\033[92m"
YELLOW = "\033[93m"
RED = "\033[91m"
RESET = "\033[0m"
COLOR_MAP = {"red":RED, "orange": YELLOW, "green": GREEN}

%matplotlib inline

Preparing the OpenAI API Key

Fine-tuning an LLM through the OpenAI platform requires an API key (OPENAI_API_KEY).

We load it so that the system can see it as an environment variable.

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')

# Create the client object used by every openai.* call later in this post
openai = OpenAI()
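With the fallback string in os.getenv() above, a missing key only shows up later as an authentication error. If you would rather fail early, here is a minimal sketch. `require_env` is a hypothetical helper written for this example, not part of the OpenAI SDK, and the demo variable name is made up.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    return value


# Demo with a throwaway variable so the snippet runs without a real key
os.environ["DEMO_API_KEY"] = "sk-demo"
print(require_env("DEMO_API_KEY"))
```

In the notebook you could call require_env("OPENAI_API_KEY") right after load_dotenv().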

Loading the Data

For fine-tuning, we use the pickle files saved locally during preprocessing. (You could also load the data uploaded to a Hugging Face repository through the Dataset API.)

# Let's avoid curating all our data again! Load in the pickle files:

with open('train.pkl', 'rb') as file:
    train = pickle.load(file)

with open('test.pkl', 'rb') as file:
    test = pickle.load(file)

# Remind ourselves the training prompt

print(train[0].prompt)

>> How much does this cost to the nearest dollar?

JS Route 66 Lamp with Shade
The desk lamp base is made of sturdy metal. A metal plate was rolled in to make the body of the lamp base. This durable desk lamp will last for a long time. And its quality is absolutely superb. The desk lamp base is made of sturdy metal Approximately 16 in. H x 10 in. W A metal plate was rolled in to make the body of the lamp base It uses a standard light bulb - A15 LED bulb, which is about 3-inch tall and available at Home Depot, big grocery chains, and online. We DON'T recommend the incandescent bulb due to the excessive heat produced. Please refer to our policy for the product warranty. Style Lamp, Brand JS, Color Silver, Dimensions 10 x 10 x

Price is $46.00

# Check the actual price of the same training item

print(train[0].price)

>> 45.99

OpenAI recommends using roughly 50 to 100 training examples when fine-tuning gpt-4o-mini (link).

Following that recommendation, I sampled 100 examples for training. For validation, I sampled another 50 examples, also from the training set.

fine_tune_train = train[3000:3100]
fine_tune_validation = train[200:250]

print(fine_tune_train[0].prompt)
>>>
How much does this cost to the nearest dollar?

Klein Tools 87890 Fall-Arrest/Positioning/Suspension Harness for Tree-Trimming Work, Small\nFrom the Manufacturer Since 1857, the company operated by Mathias Klein and his descendants to the fifth generation, has grown and developed along with the telecommunications and electrical industries where Klein pliers first found major usages. Today, Klein Tools, Inc. represents much more than Klein pliers. The company's product line has broadened to include virtually every major type of hand tool used in construction, electronics, mining, and general industry in addition to the electrical and telecommunications fields. Klein offers a lifetime warranty on material defects and workmanship for the normal life of the product. The Klein 87890 Fall-Arrest/Positioning/Suspension Harness is designed for

Price is $200.00
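Since both samples are slices of the same train list, it is worth confirming that they do not share any items. A minimal sketch of that check on the slice boundaries used above; `ranges_overlap` is a hypothetical helper written for this example.

```python
def ranges_overlap(a: range, b: range) -> bool:
    """Return True if two step-1 index ranges share at least one index."""
    return max(a.start, b.start) < min(a.stop, b.stop)


train_range = range(3000, 3100)      # indices behind train[3000:3100]
validation_range = range(200, 250)   # indices behind train[200:250]

print(len(train_range), len(validation_range))
print(ranges_overlap(train_range, validation_range))
```

The same check is useful if you later change the slice boundaries to try a larger training sample.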

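Preparing the Prompts

To help the Amazon price prediction bot predict more accurately, I spent some time preparing the prompts.

I wrote separate system prompts for the training phase and the evaluation (test) phase.

[System prompt for the training phase]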

def generate_system_message():
    prompt = """
    You are a specialist for E-Commerce Market such as Amazon.
    Your job is to estimate prices of target item with referring to description for the item.
    Reply only with the price, no explanation. 
    Price of item should be greater than ZERO. So, PLEASE DO NOT predict the price of item as exactly ZERO(DO NOT predict as $0.00).
    
    I will give you one example(User - Assistant pair).
    
    [User]
    How much does this cost?

    JS Route 66 Lamp with Shade
    The desk lamp base is made of sturdy metal. 
    A metal plate was rolled in to make the body of the lamp base. 
    This durable desk lamp will last for a long time. 
    And its quality is absolutely superb. 
    The desk lamp base is made of sturdy metal Approximately 16 in. 
    H x 10 in. 
    W A metal plate was rolled in to make the body of the lamp base It uses a standard light bulb - A15 LED bulb, which is about 3-inch tall and available at Home Depot, big grocery chains, and online. We DON'T recommend the incandescent bulb due to the excessive heat produced. 
    Please refer to our policy for the product warranty. Style Lamp, Brand JS, Color Silver, Dimensions 10 x 10 x
    
    [Assistant]
    Price is $45.99
    """
    
    return prompt

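• Persona: the model is given the role of an Amazon marketplace specialist, with the task of predicting a price from the product description.
• Constraints: the answer format is restricted in two ways. First, the model must reply with the price only, with no explanation. Second, the predicted price must be greater than zero, because none of these products are given away for free.
• Example: one-shot prompting is used so that the LLM understands its task precisely.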
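One detail of the function above: because the prompt is an indented triple-quoted string, every line is sent with four leading spaces, and the text starts and ends with a newline. This is probably harmless, but if you prefer a clean prompt, here is a minimal sketch using `textwrap.dedent`. The shortened prompt text is only a stand-in, and `clean_prompt` is a hypothetical helper.

```python
import textwrap


def clean_prompt(raw: str) -> str:
    """Remove the common leading indentation and the surrounding blank lines."""
    return textwrap.dedent(raw).strip()


raw_prompt = """
    You are a specialist for E-Commerce Market such as Amazon.
    Reply only with the price, no explanation.
    """

print(clean_prompt(raw_prompt))
```

If you apply this, apply it to both the training and the evaluation prompt so the model sees the same formatting in both phases.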

[System prompt for the evaluation phase]

def generate_system_message_for_test():
    prompt = """
    You are a specialist for E-Commerce Market such as Amazon.
    Your job is to estimate prices of target item with referring to description for the item.
    Reply only with the price, no explanation. 
    Price of item should be greater than ZERO. So, PLEASE DO NOT predict the price of item as exactly ZERO(DO NOT predict as $0.00).
    Also, when you predict the price of item, please consider DISTRIBUTION for price that you saw in training phase.
    """
    
    return prompt

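• The persona and constraints are the same as in the training phase.
• In addition, the model is told to take into account the price distribution it saw during training when predicting a price.

[Final prompt composition]

Finally, the prompts passed to the gpt-4o-mini model are composed as follows.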

# Prompt for training
def messages_for(item: Item):
    system_message = generate_system_message()
    user_prompt = item.test_prompt().replace(" to the nearest dollar","").replace("\n\nPrice is $","")
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": f"Price is ${item.price:.2f}"}
    ]
    
# Prompt for evaluation
def messages_for_test(item: Item):
    system_message = generate_system_message_for_test()
    user_prompt = item.test_prompt().replace(" to the nearest dollar","").replace("\n\nPrice is $","")
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": "Price is $"}
    ]
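To see what the two .replace() calls do, here is a small sketch on a stand-in string. It assumes that item.test_prompt() returns the training prompt cut off right after "Price is $", which is what the printed prompts earlier suggest. The Item class itself is not shown in this post, and `to_user_prompt` is a hypothetical helper.

```python
def to_user_prompt(test_prompt: str) -> str:
    """Apply the same two replacements used in messages_for()."""
    return (
        test_prompt
        .replace(" to the nearest dollar", "")
        .replace("\n\nPrice is $", "")
    )


# Stand-in for item.test_prompt(); the real text comes from the Item class
sample = (
    "How much does this cost to the nearest dollar?\n\n"
    "JS Route 66 Lamp with Shade\n"
    "The desk lamp base is made of sturdy metal.\n\n"
    "Price is $"
)

print(to_user_prompt(sample))
```

The user message ends with the product description, and the price appears only in the assistant message.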

Uploading the Data

To fine-tune an LLM on OpenAI, each set of messages (system, user, assistant) must be converted into a jsonl file and then uploaded to the OpenAI platform.

The code for this is as follows.

from typing import List


# Convert the items into a list of json objects - a "jsonl" string
# Each row represents a message in the form:
# {"messages" : [{"role": "system", "content": "You estimate prices...
def make_jsonl(items: List[Item]):
    result = ""
    for item in items:
        messages = messages_for(item)
        messages_str = json.dumps(messages)
        result += '{"messages": ' + messages_str +'}\n'
    return result.strip()
    

# Convert the items into jsonl and write them to a file
def write_jsonl(items, filename):
    with open(filename, "w") as f:
        jsonl = make_jsonl(items)
        f.write(jsonl)
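Before uploading, it can help to read the jsonl text back and confirm that every line parses and has the expected roles. The sketch below is one way to do that, not the author's code: `to_jsonl` and `check_jsonl` are hypothetical helpers, and the example conversation is a stand-in for the output of messages_for(item).

```python
import json


def to_jsonl(conversations):
    """Serialize a list of message lists, one {"messages": [...]} object per line."""
    return "\n".join(json.dumps({"messages": m}) for m in conversations)


def check_jsonl(text):
    """Parse every line and confirm it holds system, user and assistant messages in order."""
    count = 0
    for line_number, line in enumerate(text.splitlines(), start=1):
        record = json.loads(line)
        roles = [m["role"] for m in record["messages"]]
        if roles != ["system", "user", "assistant"]:
            raise ValueError(f"line {line_number}: unexpected roles {roles}")
        count += 1
    return count


# Stand-in conversation; in the post this would come from messages_for(item)
example = [
    {"role": "system", "content": "You estimate prices of items."},
    {"role": "user", "content": "How much does this cost?\n\nSome item"},
    {"role": "assistant", "content": "Price is $45.99"},
]

jsonl_text = to_jsonl([example, example])
print(check_jsonl(jsonl_text))
```

Serializing the whole dictionary with json.dumps avoids assembling the outer braces by hand, and newlines inside the content are escaped, so each record stays on one line.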

Calling write_jsonl() saves the jsonl files locally. After that, openai.files.create() uploads them to the OpenAI platform.

write_jsonl(fine_tune_train, "pricer_finetune.jsonl")
write_jsonl(fine_tune_validation, "pricer_finetune_validation.jsonl")

# Upload the same files that write_jsonl() just created
with open(file='pricer_finetune.jsonl', mode='rb') as f:
    train_file = openai.files.create(file=f, purpose='fine-tune')
    
with open(file='pricer_finetune_validation.jsonl', mode='rb') as f:
    validation_file = openai.files.create(file=f, purpose='fine-tune')

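A sample of the jsonl file is shown below. (This sample carries a shorter system prompt than the one generate_system_message() produces.)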

{"messages": [{"role": "system", "content": "You estimate prices of items. Reply only with the price, no explanation"}, {"role": "user", "content": "How much does this cost?\n\nJS Route 66 Lamp with Shade\nThe desk lamp base is made of sturdy metal. A metal plate was rolled in to make the body of the lamp base. This durable desk lamp will last for a long time. And its quality is absolutely superb. The desk lamp base is made of sturdy metal Approximately 16 in. H x 10 in. W A metal plate was rolled in to make the body of the lamp base It uses a standard light bulb - A15 LED bulb, which is about 3-inch tall and available at Home Depot, big grocery chains, and online. We DON'T recommend the incandescent bulb due to the excessive heat produced. Please refer to our policy for the product warranty. Style Lamp, Brand JS, Color Silver, Dimensions 10 x 10 x"}, {"role": "assistant", "content": "Price is $45.99"}]}
{"messages": [{"role": "system", "content": "You estimate prices of items. Reply only with the price, no explanation"}, {"role": "user", "content": "How much does this cost?\n\n4/4 Full Size Violin Case, Plush Interior Wooden Hard Case With Hygrometer, Crocodile Pattern Leather Bulge Surface Case (Black)\nHigh-quality material The case is made of high-quality wood, leather, foam, plush and hardware accessories. Very durable and sturdy, to protect your beloved violin. Sturdy shell material Durable leather and a sturdy wood shell provide the violin with a sturdy, waterproof, dust-proof storage and carrying solution. Retro style hard handle and firm lock can prevent your violin from being damaged by accident. Soft inner material Unlike the hard exterior, the inside of the violin case is made of soft plush and foam, and it is also equipped with a matching blanket, which protects your device from scratch, dents and fingerprint. Sophisticated design There is a hy"}, {"role": "assistant", "content": "Price is $79.99"}]}
{"messages": [{"role": "system", "content": "You estimate prices of items. Reply only with the price, no explanation"}, {"role": "user", "content": "How much does this cost?\n\nSupco SZO584 Refrigerator Door Gasket Replaces Sub-Zero\nSupco SZO584 Refrigerator Door Gasket, White This high-quality part is designed to meet or exceed OEM specifications. Direct replacement for Sub-Zero and About Supco Founded in 1945 in the Bronx, NY by two naval engineers, Sealed Unit Parts Co.,Inc (SUPCO) originated as a service company for refrigeration systems. We bring continued product line expansion through in-house development, master distributor relationships, and acquisition. This strengthens our position as a leader in the HVAC, Refrigeration and Appliance industries. REFRIGERATOR DOOR GASKET - This premium quality part is a direct replacement for Sub-Zero and PREMIUM REPLACEMENT - Supco SZO584 refrigerator door"}, {"role": "assistant", "content": "Price is $53.97"}]}
...

Running the Fine-Tuning Job

At last... at last!!!! It's time to fine-tune. We use the gpt-4o-mini model, and both training and validation run on the files uploaded to the OpenAI platform.

openai.fine_tuning.jobs.create(
    model='gpt-4o-mini-2024-07-18',
    training_file=train_file.id,
    hyperparameters={
        "n_epochs": 3,
        "learning_rate_multiplier": 0.1
    },
    seed=42,
    suffix="pricer",
    validation_file=validation_file.id
)
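The job runs asynchronously on OpenAI's side, so the notebook needs some way to know when it has finished. Below is a minimal polling sketch, assuming you want to block until the job reaches a final state. `wait_for_job` is a hypothetical helper, and the status source is passed in as a function so the demo runs offline with a fake sequence.

```python
import time

# Final states reported for a fine-tuning job
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(get_status, poll_seconds=30.0, max_polls=240, sleep=time.sleep):
    """Call get_status() until it returns a terminal status, then return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("fine-tuning job did not finish within the polling budget")


# Offline demo with a fake status sequence. With the real client you might pass
# lambda: openai.fine_tuning.jobs.retrieve(job_id).status
fake_statuses = iter(["validating_files", "queued", "running", "succeeded"])
print(wait_for_job(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda s: None))
```

Here job_id would be the ID of the job created above. The polling interval and the maximum number of polls are arbitrary values chosen for illustration.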

Once training finishes, we sync the logs to the Wandb platform!

job_id = openai.fine_tuning.jobs.list(limit=1).data[0].id

# passing optional parameters
WandbLogger.sync(
    fine_tune_job_id=job_id,
    num_fine_tunes=None,
    project="Pricer-FineTune-OpenAI-Frontier",
    entity=None,
    overwrite=False,
    model_artifact_name="model-metadata",
    model_artifact_type="model"
)

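[OpenAI platform]

[Wandb platform: link]

Evaluating the Fine-Tuned Model

Now let's evaluate the fine-tuned gpt-4o-mini model!

How accurately can it predict prices?!

We reuse the same Tester class that evaluated the baseline models.

Defining a single class lets us evaluate predictions the same way regardless of the model type.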

fine_tuned_model_name = openai.fine_tuning.jobs.retrieve(job_id).fine_tuned_model

# A utility function to extract the price from a string
def get_price(s):
    s = s.replace('$','').replace(',','')
    match = re.search(r"[-+]?\d*\.\d+|\d+", s)
    return float(match.group()) if match else 0

# The function for gpt-4o-mini
def gpt_fine_tuned_for_test(item):
    response = openai.chat.completions.create(
        model=fine_tuned_model_name, 
        messages=messages_for_test(item),
        seed=42,
        max_tokens=7
    )
    reply = response.choices[0].message.content
    return get_price(reply)
    
Tester.test(gpt_fine_tuned_for_test, data=test)

>>>
1: Guess: $135.45 Truth: $336.28 Error: $200.83 SLE: 0.82 Item: KOHLER K-7401-K-CP Triton Centerset Lava...
2: Guess: $14.99 Truth: $3.22 Error: $11.77 SLE: 1.77 Item: HUYGHAVO Sports Pattern Designed for iPh...
3: Guess: $198.40 Truth: $182.99 Error: $15.41 SLE: 0.01 Item: WERFACTORY Tiffany Floor Lamp Green Stai...
...
248: Guess: $260.00 Truth: $429.00 Error: $169.00 SLE: 0.25 Item: Arotikee Modern Gold Contemporary Crysta...
249: Guess: $14.21 Truth: $1.04 Error: $13.17 SLE: 4.04 Item: Cosmas® 9985SN Satin Nickel Round Cabine...
250: Guess: $25.99 Truth: $13.99 Error: $12.00 SLE: 0.35 Item: welltop Egg Holder for Refrigerator, Lar...
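The Tester class is not shown in this post, but the Error and SLE columns in the output above are consistent with an absolute error and a squared log error computed on price + 1. The sketch below reproduces those two columns for the first three rows. The hit rule (`is_hit`) and its thresholds are assumptions for illustration only and may differ from what Tester actually does.

```python
import math


def absolute_error(guess: float, truth: float) -> float:
    """Absolute difference between the predicted and the actual price."""
    return abs(guess - truth)


def squared_log_error(guess: float, truth: float) -> float:
    """Squared difference of log(1 + price), which penalizes relative errors."""
    return (math.log1p(truth) - math.log1p(guess)) ** 2


def is_hit(guess: float, truth: float, abs_tol: float = 40.0, rel_tol: float = 0.2) -> bool:
    """ASSUMPTION: a guess counts as a hit if it is within abs_tol dollars or rel_tol of the truth."""
    error = absolute_error(guess, truth)
    return error < abs_tol or (truth > 0 and error / truth < rel_tol)


# (guess, truth) pairs taken from the first three rows of the output above
rows = [(135.45, 336.28), (14.99, 3.22), (198.40, 182.99)]
for guess, truth in rows:
    print(
        f"Error: ${absolute_error(guess, truth):.2f} "
        f"SLE: {squared_log_error(guess, truth):.2f} "
        f"Hit: {is_hit(guess, truth)}"
    )
```

Averaging the absolute errors over all test items gives the MAE reported in the table.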

Model	Mean absolute error (MAE, $)	SLE	Hit ratio (%)
Random Model	390.09	1.97	7.2
Linear Regression Model	99.09	1.14	25.6
Linear Regression + Word2Vec	86.52	1.02	38.4
Support Vector Machine	82.29	0.91	46.8
GPT-4o-Mini with Fine-tuning	66.51	0.61	57.2
Meta-Llama-3.1-8B with Fine-tuning	-	-	-

Wow! Compared with the support vector machine, the best of the baselines, the fine-tuned model cuts MAE by about 19% and SLE by about 33%, and raises the hit ratio by 10.4 percentage points!!
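For reference, the comparison against the support vector machine can be recomputed directly from the two table rows. A minimal sketch; `relative_change_pct` is a hypothetical helper, and the numbers are copied from the table above.

```python
def relative_change_pct(baseline: float, new: float) -> float:
    """Relative change from baseline to new, in percent (negative means a decrease)."""
    return (new - baseline) / baseline * 100


svm = {"mae": 82.29, "sle": 0.91, "hit": 46.8}
gpt = {"mae": 66.51, "sle": 0.61, "hit": 57.2}

print(f"MAE change: {relative_change_pct(svm['mae'], gpt['mae']):.1f}%")
print(f"SLE change: {relative_change_pct(svm['sle'], gpt['sle']):.1f}%")
print(f"Hit ratio change: {gpt['hit'] - svm['hit']:.1f} percentage points")
```

MAE and SLE are errors, so a negative change is an improvement. The hit ratio is already a percentage, so its change is stated in percentage points.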

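That's really great :)

Summary

In this post, we fine-tuned the gpt-4o-mini model and checked its price prediction performance. It clearly outperforms all four baseline models. I was surprised once again that just 100 training examples were enough to improve performance.

I also came to feel that carefully crafting the prompts before fine-tuning is very important.

In the final post, I'll fine-tune Llama-3.1-8B, a leading open-source model, and fill in the last empty row of the results table!!

See you in the final post!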
