Why use the RAG pattern?
In generative AI, and in text-generation models in particular, 'hallucination' refers to the model producing incorrect information or presenting things that do not exist as if they were fact. This can happen for the following reasons:
- Limits of the training data: if the data the model was trained on contains errors, or lacks information about a particular topic, the model can generate inaccurate information.
- Limits of contextual understanding: current AI models have limited capacity for deep understanding and reasoning, so they may not always grasp complex context or subtle nuance correctly.
- Over-generalization: a model trained on very broad data can sometimes fail to handle specific situations or exceptional cases properly.
- Limits of the generation logic: generative models mostly try to produce the most probable text, and in the process they can output results that are unrealistic or inconsistent with the facts.
To compensate for this hallucination phenomenon, we use a vector database as an "external knowledge base" that acts like long-term memory. The pattern of combining such a vector database with a generative AI model is called the RAG (Retrieval-Augmented Generation) pattern, and it can mitigate hallucination to a meaningful degree.
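Conceptually, the flow looks roughly like the sketch below. This is only a rough outline with assumed placeholder names (embed, vector_db.search, and llm are not real APIs); the concrete OpenAI and Pinecone calls are walked through step by step in the rest of this post.
# Conceptual sketch of the RAG flow; embed, vector_db and llm are placeholders,
# not real APIs. The real calls are shown in the sections below.
def rag_answer(question, embed, vector_db, llm):
    query_vector = embed(question)                       # 1) embed the question
    contexts = vector_db.search(query_vector, top_k=3)   # 2) nearest-neighbor search
    prompt = "Answer the question based on the context below.\n\nContext:\n"
    prompt += "\n\n---\n\n".join(contexts)               # 3) ground the prompt in retrieved text
    prompt += f"\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)                                   # 4) generate a grounded answer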
Vector embedding and search with OpenAI and Pinecone
The official example is quite dated and the openai library's usage has changed significantly since it was written, so I put together a new guide based on the current official documentation. Everything below was run in a Jupyter notebook environment.
Installing the libraries
!pip install -qU openai pinecone-client datasets
import openai
import pinecone
pinecone.__version__
'2.2.4'
openai.__version__
'1.3.6'
Setting the OpenAI API key
from openai import OpenAI
import os
client = OpenAI(api_key="sk-xx")
Asking GPT-3.5
query = "Which training method should I use for sentence transformers when " + "I only have pairs of related sentences?" + "answer must be korean"
res = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[
        {
            "role": "user",
            "content": f"{query}",
        },
    ],
)
print(res.choices[0].message.content)
sentence transformers 모델을 훈련시킬 때, 짝을 이루는 관련 문장들만 가지고 있다면 siamese network나 triplet loss와 같은 contrastive learning 방법을 사용하는 것이 좋습니다. 이 방법은 관련 있는 문장 쌍을 잘 분류하고 임베딩하는데 효과적입니다.
Creating vector embeddings
We will use "text-embedding-ada-002" as the embedding model.
embed_model = "text-embedding-ada-002"
res = client.embeddings.create(
    input=[query],
    model=embed_model,
)
Let's print the response.
res
CreateEmbeddingResponse(data=[Embedding(embedding=[-0.027723563835024834, -0.019584406167268753, 0.03897269070148468, -0.005519496742635965, -0.003617791924625635, 0.024949805811047554, 0.013224378228187561, -0.008755546994507313, -0.0076348367147147655, -0.04236283898353577, 0.016124214977025986, 0.05572730302810669, 0.004454822279512882, 0.013420501723885536, 0.00665071327239275, 0.019570399075746536, 0.011543313041329384, 0.007501752581447363, 0.01629232056438923, -0.019654450938105583, -0.019962646067142487, 0.0011662387987598777, -0.010730798356235027, -0.028424007818102837, -0.0050677102990448475, -0.014310065656900406, 0.027219243347644806, -0.026336684823036194, -0.019430309534072876, -0.01646042801439762, 0.02670091576874256, 0.010128416121006012, -0.018729865550994873, -0.010478638112545013, 0.013238387182354927, -0.0011776210740208626, 0.014891433529555798, 0.018183520063757896, 0.023674998432397842, -0.017987394705414772, 0.014092927798628807, 0.014583238400518894, 0.004391782451421022, -0.036114878952503204, 0.007256597280502319, 0.015367736108601093, -0.005701612215489149, -0.013028253801167011, -0.01913612335920334 … ], index=0, object='embedding')], model='text-embedding-ada-002-v2', object='list', usage=Usage(prompt_tokens=17, total_tokens=17))
Let's check the vector's size. The dimensionality is determined by the embedding model; text-embedding-ada-002 produces vectors of length 1536.
res.data[0].embedding
[-0.027723563835024834,-0.019584406167268753,0.03897269070148468,-0.005519496742635965,-0.003617791924625635,0.024949805811047554,0.013224378228187561,-0.008755546994507313,-0.0076348367147147655,-0.04236283898353577, …]
len(res.data[0].embedding)
1536
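Since the Pinecone index created later uses the cosine metric, it may help to see what that comparison looks like in plain Python first. The snippet below is only an illustration and not part of the pipeline; the second sentence is a made-up example, and numpy is assumed to be available.
import numpy as np

# Embed a hypothetical second sentence the same way as the query above.
res2 = client.embeddings.create(
    input=["How do I fine-tune a sentence transformer with sentence pairs?"],
    model=embed_model,
)
a = np.array(res.data[0].embedding)
b = np.array(res2.data[0].embedding)
# Cosine similarity = dot product of the L2-normalized vectors;
# this is the same metric the Pinecone index below is configured with.
cos_sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(cos_sim)  # closer to 1.0 means the sentences are more semantically similar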
Loading the data to embed
Let's load the data we are going to embed. It is a dataset of YouTube caption data: https://huggingface.co/datasets/jamescalam/youtube-transcriptions
title (video title)
published (publish time)
url
video_id (https://www.youtube.com/watch?v={id})
channel_id (channel ID)
id (unique record ID)
text (the actual caption text)
start (start time)
end (end time)
from datasets import load_dataset
data = load_dataset('jamescalam/youtube-transcriptions', split='train')
data
Dataset({ features: ['title', 'published', 'url', 'video_id', 'channel_id', 'id', 'text', 'start', 'end'], num_rows: 208619 })
Let's look at the first record.
data[0]
{'title': 'Training and Testing an Italian BERT - Transformers From Scratch #4', 'published': '2021-07-06 13:00:03 UTC', 'url': 'https://youtu.be/35Pdoyi6ZoQ', 'video_id': '35Pdoyi6ZoQ', 'channel_id': 'UCv83tO5cePwHMt1952IVVHw', 'id': '35Pdoyi6ZoQ-t0.0', 'text': 'Hi, welcome to the video.', 'start': 0.0, 'end': 9.36}
The stored chunks look too small, so we need to merge them. We will concatenate 20 sentences into one chunk and advance with a stride of 4, so each new chunk starts 4 sentences after the previous one and neighbouring chunks overlap. The tqdm package is used here only to show a progress bar while iterating.
from tqdm.auto import tqdm

new_data = []

window = 20  # number of sentences to combine
stride = 4   # number of sentences to 'stride' over, used to create overlap

for i in tqdm(range(0, len(data), stride)):
    i_end = min(len(data)-1, i+window)
    if data[i]['title'] != data[i_end]['title']:
        # in this case we skip this entry as we have start/end of two videos
        continue
    text = ' '.join(data[i:i_end]['text'])
    # create the new merged dataset
    new_data.append({
        'start': data[i]['start'],
        'end': data[i_end]['end'],
        'title': data[i]['title'],
        'text': text,
        'id': data[i]['id'],
        'url': data[i]['url'],
        'published': data[i]['published'],
        'channel_id': data[i]['channel_id']
    })
100% 52155/52155 [00:18<00:00, 2842.63it/s]
Let's check one of the newly merged records.
new_data[0]
{'start': 0.0, 'end': 74.12, 'title': 'Training and Testing an Italian BERT - Transformers From Scratch #4', 'text': "Hi, welcome to the video. So this is the fourth video in a Transformers from Scratch mini series. So if you haven't been following along, we've essentially covered what you can see on the screen. So we got some data. We built a tokenizer with it. And then we've set up our input pipeline ready to begin actually training our model, which is what we're going to cover in this video. So let's move over to the code. And we see here that we have essentially everything we've done so far. So we've built our input data, our input pipeline. And we're now at a point where we have a data loader, PyTorch data loader, ready. And we can begin training a model with it. So there are a few things to be aware of. So I mean, first, let's just have a quick look at the structure of our data.", 'id': '35Pdoyi6ZoQ-t0.0', 'url': 'https://youtu.be/35Pdoyi6ZoQ', 'published': '2021-07-06 13:00:03 UTC', 'channel_id': 'UCv83tO5cePwHMt1952IVVHw'}
Vector indexing with Pinecone
Now we will vectorize the text field and store it in Pinecone as vectors, while the remaining fields are stored as metadata. We have already used OpenAI's embedding model above, so the next step is to store (index) the data. First, let's create the index.
import pinecone

index_name = 'youtube-transcriptions'

# initialize connection to pinecone (get API key at app.pinecone.io)
pinecone.init(
    api_key="xx",
    environment="gcp-starter"  # may be different, check at app.pinecone.io
)

# check if index already exists (it shouldn't if this is first time)
if index_name not in pinecone.list_indexes():
    # if it does not exist, create index
    pinecone.create_index(
        index_name,
        dimension=len(res.data[0].embedding),
        metric='cosine',
        metadata_config={'indexed': ['channel_id', 'published']}
    )

# connect to index
index = pinecone.Index(index_name)
# view index stats
index.describe_index_stats()
{'dimension': 1536, 'index_fullness': 0.0, 'namespaces': {}, 'total_vector_count': 0}
Note that at this point the index has only been created and holds 0 vectors. Now let's embed new_data and upsert it into the index. In this run the whole process took a little over 23 minutes.
from time import sleep

batch_size = 100  # how many embeddings we create and insert at once

for i in tqdm(range(0, len(new_data), batch_size)):
    # find end of batch
    i_end = min(len(new_data), i+batch_size)
    meta_batch = new_data[i:i_end]
    # get ids
    ids_batch = [x['id'] for x in meta_batch]
    # get texts to encode
    texts = [x['text'] for x in meta_batch]
    # create embeddings (try-except added to avoid RateLimitError)
    done = False
    while not done:
        try:
            res = client.embeddings.create(input=texts, model=embed_model)
            done = True
        except:
            sleep(5)
    embeds = [record.embedding for record in res.data]
    # cleanup metadata
    meta_batch = [{
        'start': x['start'],
        'end': x['end'],
        'title': x['title'],
        'text': x['text'],
        'url': x['url'],
        'published': x['published'],
        'channel_id': x['channel_id']
    } for x in meta_batch]
    to_upsert = list(zip(ids_batch, embeds, meta_batch))
    # upsert to Pinecone
    index.upsert(vectors=to_upsert)
100% 487/487 [23:10<00:00, 2.27s/it]
Let's check the stored data. Pinecone does not use sequential (positional) indexing; instead, records are referenced by their unique IDs. Let's fetch the record with the ID "35Pdoyi6ZoQ-t0.0", which should have been the first one inserted.
index.fetch(ids=["35Pdoyi6ZoQ-t0.0"]).vectors["35Pdoyi6ZoQ-t0.0"].values
[-0.0104372036, -0.0183759257, -0.00420779176, -0.039465256, 0.00777753023, 0.00864393916, 0.00778424647, -0.0228624456, -0.00636373926, -0.0150714833, 0.0382025838, 0.0172744449, -0.0420174673, -0.00905363634, 0.0156759545, -0.0121162906, 0.00916109793, 0.00553427031, …]
index.fetch(ids=["35Pdoyi6ZoQ-t0.0"]).vectors["35Pdoyi6ZoQ-t0.0"].metadata
{'channel_id': 'UCv83tO5cePwHMt1952IVVHw', 'end': 74.12, 'published': datetime.datetime(2021, 7, 6, 13, 0, 3, tzinfo=tzutc()), 'start': 0.0, 'text': "Hi, welcome to the video. So this is the fourth video in a Transformers from Scratch mini series. So if you haven't been following along, we've essentially covered what you can see on the screen. So we got some data. We built a tokenizer with it. And then we've set up our input pipeline ready to begin actually training our model, which is what we're going to cover in this video. So let's move over to the code. And we see here that we have essentially everything we've done so far. So we've built our input data, our input pipeline. And we're now at a point where we have a data loader, PyTorch data loader, ready. And we can begin training a model with it. So there are a few things to be aware of. So I mean, first, let's just have a quick look at the structure of our data.", 'title': 'Training and Testing an Italian BERT - Transformers From Scratch #4', 'url': 'https://youtu.be/35Pdoyi6ZoQ'}
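Because channel_id and published were listed in metadata_config when the index was created, those fields can also be used as query-time filters. The snippet below is only an illustrative sketch of that (the channel ID is simply the one from the record above, and xq is re-embedded from the earlier query); it is not required for the rest of the walkthrough.
# Re-embed the earlier query and search with a metadata filter on channel_id.
xq = client.embeddings.create(input=[query], model=embed_model).data[0].embedding
filtered = index.query(
    vector=xq,
    top_k=3,
    include_metadata=True,
    filter={"channel_id": {"$eq": "UCv83tO5cePwHMt1952IVVHw"}},
)
for match in filtered['matches']:
    print(match['id'], match['score'], match['metadata']['title'])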
Vector search
Everything is now in place, so let's run a test with the RAG pattern applied.
We start by writing a retrieve function for this.
limit = 3750

def retrieve(query):
    res = client.embeddings.create(input=query, model=embed_model)
    # retrieve from Pinecone
    xq = res.data[0].embedding
    # get relevant contexts
    res = index.query(xq, top_k=3, include_metadata=True)
    contexts = [
        x['metadata']['text'] for x in res['matches']
    ]
    # build our prompt with the retrieved contexts included
    prompt_start = (
        "Answer the question based on the context below. reply must be korean\n\n" +
        "Context:\n"
    )
    prompt_end = (
        f"\n\nQuestion: {query}\nAnswer:"
    )
    # append contexts until hitting limit
    for i in range(1, len(contexts)):
        if len("\n\n---\n\n".join(contexts[:i])) >= limit:
            prompt = (
                prompt_start +
                "\n\n---\n\n".join(contexts[:i-1]) +
                prompt_end
            )
            break
        elif i == len(contexts)-1:
            prompt = (
                prompt_start +
                "\n\n---\n\n".join(contexts) +
                prompt_end
            )
    return prompt
question = "Which training method should I use for sentence transformers when " + "I only have pairs of related sentences?"
query_with_contexts = retrieve(question)
query_with_contexts
"Answer the question based on the context below. reply must be korean\n\nContext:\npairs of related sentences you can go ahead and actually try training or fine-tuning using NLI with multiple negative ranking loss. If you don't have that fine. Another option is that you have a semantic textual similarity data set or STS and what this is is you have so you have sentence A here, sentence B here and then you have a score from from 0 to 1 that tells you the similarity between those two scores and you would train this using something like cosine similarity loss. Now if that's not an option and your focus or use case is on building a sentence transformer for another language where there is no current sentence transformer you can use multilingual parallel data. So what I mean by that is so parallel data just means translation pairs so if you have for example a English sentence and then you have another language here so it can it can be anything I'm just going to put XX and that XX is your target language you can fine-tune a model using something called multilingual knowledge distillation and what that does is takes a monolingual model for example in English and using those translation pairs it distills the knowledge the semantic similarity knowledge from that monolingual English model into a multilingual model which can handle both English and your target language. So they're three options quite popular very common that you can go for and as a supervised methods the chances are that probably going to outperform anything you do with unsupervised training at least for now. So if none of those sound like something\n\n---\n\nwere actually more accurate. So we can't really do that. We can't use this what is called a mean pooling approach. Or we can't use it in its current form. Now the solution to this problem was introduced by two people in 2019 Nils Reimers and Irenia Gurevich. They introduced what is the first sentence transformer or sentence BERT. And it was found that sentence BERT or S BERT outformed all of the previous Save the Art models on pretty much all benchmarks. Not all of them but most of them. And it did it in a very quick time. So if we compare it to BERT, if we wanted to find the most similar sentence pair from 10,000 sentences in that 2019 paper they found that with BERT that took 65 hours. With S BERT embeddings they could create all the embeddings in just around five seconds. And then they could compare all those with cosine similarity in 0.01 seconds. So it's a lot faster. We go from 65 hours to just over five seconds which is I think pretty incredible. Now I think that's pretty much all the context we need behind sentence transformers. And what we do now is dive into a little bit of how they actually work. Now we said before we have the core transform models and what S BERT does is fine tunes on sentence pairs using what is called a Siamese architecture or Siamese network. What we mean by a Siamese network is that we have what we can see, what can view as two BERT models that are identical and the weights between those two models are tied. Now in reality when implementing this we just use a single BERT model. And what we do is we process one sentence, a sentence A through the model and then we process another sentence, sentence B through the model. And that's the sentence pair. So with our cross-linked we were processing the sentence pair together. We were putting them both together, processing them all at once. This time we process them separately. 
And during training what happens is the weights\n\n---\n\nfrom all of them, not really random sampling, just taking all the possible pairs, we end up with nine new or nine pairs in total, which is much better if you extend that a little further. So from just a thousand pairs, we can end up with one million pairs. So you can see quite quickly, you can take a small data set and very quickly create a big data set with it. Now this is just one part of the problem though, because our smaller data set will have similarity scores or natural language inference labels, but the new data set that we've just created, the augmented data set, doesn't have any of those, just randomly sampled new sentence pairs. So there's no scores or labels there and we need those to actually train and model. So what we can do is take a slightly different approach or add another step into here. Now that other set is using something called a cross encoder. So in semantic similarity, we can use two different types of models. We can use a cross encoder, which is over here, or we can use a bi-encoder or what I would usually call a sentence transporter. Now a cross encoder is the sort of old way of doing it and it works by simply putting sentence A and sentence B into a BERT model together at once. So we have sentence A, separate a token, sentence B, feed that into a BERT model and from that BERT model we will get all of our embeddings, output embeddings over here and they all get fed into a linear layer, which converts all of those into a similarity score up here. Now that similarity score is typically going to be more accurate than a similarity score that you get from a bi-encoder or a sentence transformer. But the problem here is from our sentence transformer we are outputting sentence vectors and if we have two sentence vectors we can perform a cosine similarity or\n\nQuestion: Which training method should I use for sentence transformers when I only have pairs of related sentences?\nAnswer:"
Now that a prompt with the retrieved context has been built, let's send it back to GPT. We write a complete function for this.
def complete(prompt):
    res = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[
            {
                "role": "user",
                "content": f"{prompt}",
            },
        ],
        temperature=0,
        max_tokens=400,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0,
        stop=None
    )
    return res.choices[0].message.content
complete(query_with_contexts)
'당신이 관련된 문장 쌍만 가지고 있을 때 문장 변환기를 위해 어떤 훈련 방법을 사용해야 하나요? \n답변: 다중 부정 순위 손실을 사용하여 NLI를 훈련하거나 세맨틱 텍스트 유사성 데이터 세트 또는 STS를 사용하여 코사인 유사성 손실을 사용하여 훈련할 수 있습니다. 또는 다국어 병렬 데이터를 사용하여 모델을 미세 조정할 수 있습니다.'
Comparing the answers
Plain GPT-3.5
sentence transformers 모델을 훈련시킬 때, 짝을 이루는 관련 문장들만 가지고 있다면 siamese network나 triplet loss와 같은 contrastive learning 방법을 사용하는 것이 좋습니다. 이 방법은 관련 있는 문장 쌍을 잘 분류하고 임베딩하는데 효과적입니다.
GPT-3.5 + Pinecone vectors (RAG)
다중 부정 순위 손실을 사용하여 NLI를 훈련하거나 세맨틱 텍스트 유사성 데이터 세트 또는 STS를 사용하여 코사인 유사성 손실을 사용하여 훈련할 수 있습니다. 또는 다국어 병렬 데이터를 사용하여 모델을 미세 조정할 수 있습니다.
The RAG answer was grounded in the following three retrieved passages:
pairs of related sentences you can go ahead and actually try training or fine-tuning using NLI with multiple negative ranking loss. If you don't have that fine. Another option is that you have a semantic textual similarity data set or STS and what this is is you have so you have sentence A here, sentence B here and then you have a score from from 0 to 1 that tells you the similarity between those two scores and you would train this using something like cosine similarity loss. Now if that's not an option and your focus or use case is on building a sentence transformer for another language where there is no current sentence transformer you can use multilingual parallel data. So what I mean by that is so parallel data just means translation pairs so if you have for example a English sentence and then you have another language here so it can it can be anything I'm just going to put XX and that XX is your target language you can fine-tune a model using something called multilingual knowledge distillation and what that does is takes a monolingual model for example in English and using those translation pairs it distills the knowledge the semantic similarity knowledge from that monolingual English model into a multilingual model which can handle both English and your target language. So they're three options quite popular very common that you can go for and as a supervised methods the chances are that probably going to outperform anything you do with unsupervised training at least for now. So if none of those sound like something
were actually more accurate. So we can't really do that. We can't use this what is called a mean pooling approach. Or we can't use it in its current form. Now the solution to this problem was introduced by two people in 2019 Nils Reimers and Irenia Gurevich. They introduced what is the first sentence transformer or sentence BERT. And it was found that sentence BERT or S BERT outformed all of the previous Save the Art models on pretty much all benchmarks. Not all of them but most of them. And it did it in a very quick time. So if we compare it to BERT, if we wanted to find the most similar sentence pair from 10,000 sentences in that 2019 paper they found that with BERT that took 65 hours. With S BERT embeddings they could create all the embeddings in just around five seconds. And then they could compare all those with cosine similarity in 0.01 seconds. So it's a lot faster. We go from 65 hours to just over five seconds which is I think pretty incredible. Now I think that's pretty much all the context we need behind sentence transformers. And what we do now is dive into a little bit of how they actually work. Now we said before we have the core transform models and what S BERT does is fine tunes on sentence pairs using what is called a Siamese architecture or Siamese network. What we mean by a Siamese network is that we have what we can see, what can view as two BERT models that are identical and the weights between those two models are tied. Now in reality when implementing this we just use a single BERT model. And what we do is we process one sentence, a sentence A through the model and then we process another sentence, sentence B through the model. And that's the sentence pair. So with our cross-linked we were processing the sentence pair together. We were putting them both together, processing them all at once. This time we process them separately. And during training what happens is the weights
from all of them, not really random sampling, just taking all the possible pairs, we end up with nine new or nine pairs in total, which is much better if you extend that a little further. So from just a thousand pairs, we can end up with one million pairs. So you can see quite quickly, you can take a small data set and very quickly create a big data set with it. Now this is just one part of the problem though, because our smaller data set will have similarity scores or natural language inference labels, but the new data set that we've just created, the augmented data set, doesn't have any of those, just randomly sampled new sentence pairs. So there's no scores or labels there and we need those to actually train and model. So what we can do is take a slightly different approach or add another step into here. Now that other set is using something called a cross encoder. So in semantic similarity, we can use two different types of models. We can use a cross encoder, which is over here, or we can use a bi-encoder or what I would usually call a sentence transporter. Now a cross encoder is the sort of old way of doing it and it works by simply putting sentence A and sentence B into a BERT model together at once. So we have sentence A, separate a token, sentence B, feed that into a BERT model and from that BERT model we will get all of our embeddings, output embeddings over here and they all get fed into a linear layer, which converts all of those into a similarity score up here. Now that similarity score is typically going to be more accurate than a similarity score that you get from a bi-encoder or a sentence transformer. But the problem here is from our sentence transformer we are outputting sentence vectors and if we have two sentence vectors we can perform a cosine similarity or
This time, let's ask the same question to GPT-4.
query = "Which training method should I use for sentence transformers when " + "I only have pairs of related sentences?" + "answer must be korean"
res = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {
            "role": "user",
            "content": f"{query}",
        },
    ],
    temperature=0,
    max_tokens=400,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None
)
print(res.choices[0].message.content)
문장 트랜스포머(Sentence Transformers)를 훈련시킬 때 관련된 문장 쌍만 가지고 있다면, Siamese(시암) 또는 Triplet(트리플렛) 네트워크 구조를 사용하는 것이 좋습니다. 이러한 방법들은 문장 쌍 사이의 유사성을 학습하는 데 효과적입니다. 1. Siamese Network Training (시암 네트워크 훈련): Siamese 네트워크는 두 개의 동일한 서브네트워크를 사용하여 각 문장을 인코딩하고, 이 인코딩된 벡터들 사이의 거리를 최소화하는 방식으로 훈련됩니다. 관련된 문장 쌍은 가까운 벡터로, 관련 없는 문장 쌍은 먼 벡터로 매핑되도록 학습합니다. Contrastive Loss나 Triplet Loss를 사용할 수 있습니다. 2. Triplet Network Training (트리플렛 네트워크 훈련): 트리플렛 네트워크는 기준 문장(anchor), 긍정적인 문장(positive), 그리고 부정적인 문장(negative)의 세 쌍을 사용합니다. 기준 문장과 긍정적인 문장은 서로 관련이 있고, 부정적인 문장은 관련이 없습니다. 트리