# SQLとベクトルの結合クエリ

MyScaleは、AIデータのワークロードにおいてSQLとベクトルのパワーを組み合わせています。SQLの包括的なシンタックスライブラリを使用して、構造化(ベクトル化)および非構造化のデータに対して複雑な伝統的なSQLクエリとベクトルクエリを実行することができます。これにより、迅速かつ効率的なデータ分析操作が可能となります。

# SQL + ベクトル: なぜ重要なのか?

伝統的なSQLとベクトルの結合クエリの重要性は、構造化データとベクトル埋め込み(データ)を組み合わせて複雑なクエリを処理し、高次元データを統一的かつ効率的に分析する能力にあります。このアプローチには、以下のようないくつかの利点と機能があります。

  • さまざまなデータ型と関数を扱う: MyScaleは、高次元ベクトルデータと思いつくようなメタデータを格納およびクエリできます。このシームレスな統合により、ユーザーはより複雑なクエリとデータ分析を設計・実行することができます。AIモデルとビジネスデータを使用することで、より包括的なデータの視点を提供し、意思決定の利益をもたらします。

  • ハイブリッド検索: ハイブリッド検索では、ベクトル化されたドキュメントの意味検索と伝統的なキーワードマッチングを組み合わせることで、ベクトル化されたドキュメントの意味的なカバレッジの不足に対処します。特に、レアな単語を特定する場合や、特定の用語を検索する意図がある場合に有益です。さらに、ハイブリッド検索の基盤は、SQLとベクトルの結合クエリを通じて確立されます。

  • 統一的なクエリ言語: SQLは、多くの開発者やデータアナリストによって使用される広く使われているデータベースクエリ言語です。ユーザーは、SQLとベクトル検索を1つのクエリで結合することで、既に知っているシンタックスをシームレスに統合することができます。新しいクエリ言語を学ぶ必要がなくなります。

  • 効率的なデータの取得: ベクトル化(非構造化)および構造化データは、効率的かつ効果的なデータの取得には特殊なベクトルインデックスと検索アルゴリズムが必要とされることがあります。SQLとベクトルの結合クエリを1つのクエリで結合することで、MyScaleエンジンはベクトル関連の検索を実行し、クエリの実行計画を最適化してデータをより高速かつ効率的に取得することができます。

    TIP

    データを効果的に取得する方法の例については、高度なアプリケーション: クエリ設計を参照してください。

  • データワークフローの効率化: MyScaleは、ネイティブのSQLからベクトル検索クエリまでの統一されたデータソリューションを提供します。SQLユーザーはベクトル検索を含めたクエリを拡張することができ、ベクトル検索ユーザーはメタデータをよく構造化されたテーブルに格納することができます。

まとめると、伝統的なSQLとベクトルクエリは、高次元ベクトルデータを処理・分析することができ、データベースのアプリケーション領域を拡張し、データ分析の能力を向上させることができます。これは、機械学習、データサイエンス、画像処理など、多くの分野において重要です。

TIP

伝統的なSQLクエリとベクトル検索の堅牢さを理解するために、サンプルアプリケーション高度なアプリケーションを参照してください。

# SQL + ベクトルの使用方法: ChatDataをベースにしたデモ

ChatDataは、LLMチャットアプリであり、ドキュメントとの対話時に効率と正確性を提供します。強力なメタデータフィルターと高度なベクトル検索機能により、ドメインデータに関する質問に簡単に答えることができます。

TIP

ChatDataを使用すると、学術論文を含む数百万のドキュメントとシームレスに対話することができます。研究者、学生、知識愛好家のいずれであっても、ChatDataを使用して学術論文や研究文書を探索することはこれまで以上に簡単です。ChatDataを使用して情報検索の真のポテンシャルを引き出し、知識の世界を手に入れましょう。

以下のセクションでは、ChatDataのクエリ検索実装に基づいたMyScaleのデモを作成し、SQLとベクトルの結合クエリを実行する方法を説明します。

# データベーステーブルの作成

このサンプルアプリケーションでは、PDFドキュメントライブラリを使用して、クエリ時に格納および取得される学術論文データ(ベクトル形式)を使用します。MyScaleのSQLワークスペースで、以下のSQLステートメントを実行して、ドキュメントライブラリの関係テーブルを作成します。

# chatPDF テーブル

このテーブルは、以下のフィールドに格納されるPDFドキュメントライブラリから抽出された段落テキストデータを格納します。

  1. id: 段落のID
  2. pdf (SHA256): 対応するPDFのID
  3. content: 段落の内容
  4. section: 段落が所属するセクション
  5. page: ページ番号
  6. vector: 段落の特徴ベクトル
CREATE TABLE default.chatPDF (
    `id` String,
    `pdf` String,
    `content` String,
    `section` String,
    `page` Int32,
    `vector` Array(Float32),
    VECTOR INDEX vec_idx vector TYPE MSTG('metric_type=Cosine'),
    CONSTRAINT vec_len CHECK length(vector) = 512)
ENGINE = ReplacingMergeTree ORDER BY id SETTINGS index_granularity = 8192;

# chatPDF_meta テーブル

このテーブルは、以下のフィールドにリンクされたドキュメントライブラリのPDFファイルに関連するメタデータ情報を格納します。

  1. pdf (SHA256): PDFのID
  2. title: PDFのタイトル
  3. authors: PDFの著者
  4. abstract: PDFの要約
  5. pub_date: 公開日
  6. doi: ドキュメントのDOI(デジタルオブジェクト識別子)- https://doi.org (opens new window)
  7. publisher: ドキュメントの出版社
  8. article_type: 記事のタイプ/カテゴリ - {'arXiv', 'Finance'}
  9. vector: 著者の特徴ベクトル
CREATE TABLE default.chatPDF_meta (
    `pdf` String,
    `title` String,
    `authors` Array(String),
    `abstract` String,
    `pub_date` Nullable(Date32),
    `doi` String,
    `publisher` String,
    `article_type` String,
    `vector` Array(Float32),
    VECTOR INDEX vec_idx vector TYPE MSTG('metric_type=Cosine'),
    CONSTRAINT vec_len CHECK length(vector) = 512)
ENGINE = ReplacingMergeTree ORDER BY pdf SETTINGS index_granularity = 8192;

# chatPDF_ref テーブル

このテーブルは、以下のフィールドに引用されたドキュメントライブラリのPDFファイルに関する情報を格納します。

  1. pdf (SHA256): 引用されたPDFのID
  2. title: 引用された記事のタイトル
  3. authors: 引用された記事の著者
  4. journal: 引用された記事の要約
  5. year: 引用された記事の発行年
CREATE TABLE default.chatPDF_ref (
    `title` String,
    `pdf` String,
    `journal` String,
    `authors` Array(String),
    `year` Nullable(Date32))
ENGINE = ReplacingMergeTree ORDER BY title SETTINGS index_granularity = 8192;

# chatPDF_user テーブル

このテーブルは、ドキュメントライブラリのPDFドキュメントに対するユーザーのアクセス権限を以下のフィールドに格納します。

  1. id: ACL(アクセス制御リスト)のID
  2. user: ACLユーザー
  3. pdf: ユーザーがアクセスできるACL PDFドキュメントコード
CREATE TABLE default.chatPDF_user (
    `id` String, 
    `user` String, 
    `pdf` String) 
ENGINE = ReplacingMergeTree ORDER BY id SETTINGS index_granularity = 8192;

TIP

ユーザーが複数のPDFドキュメントへのアクセス権限を持っている場合、chatPDF_user テーブルには、そのユーザーに関連する複数の行のレコードが存在します。

# SQLとベクトルの結合クエリの実行

これらのデータベーステーブルを使用して、検索要件を満たすSQLとベクトルの結合クエリを効率的に実行する方法を説明します。

# ユースケース1: PDFファイルのコレクションから関連する論文を取得する

このシナリオでは、以下の条件を満たすchatPDF テーブルからPDFとそのタイトルを取得することを目的としています。

  • user が "default" または "test" のいずれかであること
  • authors に "Cedar" のような名前が含まれていること
  • クエリベクトルとの距離に基づいて順序付けられること(意味的な類似性を表す)

これらの条件を満たすデータを取得するには、以下のSQLステートメントを実行します:

SELECT content, section, page, title, pdf
FROM chatPDF
JOIN chatPDF_meta ON chatPDF.pdf = chatPDF_meta.pdf
WHERE pdf IN (
    SELECT pdf FROM chatPDF_user
    JOIN chatPDF_meta ON chatPDF_user.pdf = chatPDF_meta.pdf
    WHERE user IN ('default', 'test') AND match(arrayStringConcat(authors, ';'), '(?i)(;Ceder\s|\sCeder;)')
)
GROUP BY content, section, page, d, title, pdf
ORDER BY distance(chatPDF.vector, [0.026999086141586304,0.031650662422180176,0.013864615000784397,-0.053247902542352676,0.06827133148908615,-0.07262341678142548,0.05043990910053253,0.0013820948079228401,-0.008540806360542774,-0.0002308133989572525,0.06593822687864304,-0.040088851004838943,0.0065822964534163475,-0.03170497715473175,-0.008417543955147266,0.02299412712454796,-0.051368385553359985,0.008389266207814217,-0.079441137611866,0.02974756620824337,0.016738949343562126,0.06552143394947052,-0.0015636092284694314,-0.027125705033540726,0.050242915749549866,0.01254260167479515,-0.0029571610502898693,-0.10734105110168457,-0.0828493982553482,-0.026928286999464035,0.045929133892059326,0.0008282766793854535,-0.054626185446977615,0.030738720670342445,0.03874953091144562,0.008306861855089664,0.021981295198202133,0.022476540878415108,-0.0060426779091358185,0.02391696535050869,0.030219828709959984,0.04619702696800232,-0.06530780345201492,0.07061367481946945,0.05533526837825775,-0.003080346155911684,-0.02253931388258934,0.06256599724292755,-0.0776442363858223,0.01975518837571144,-0.027114108204841614,0.04596388712525368,0.02279621735215187,0.002945868531242013,-0.0015898416750133038,0.05462668836116791,-0.05107376351952553,-0.07294963300228119,-0.03645467385649681,-0.004737087991088629,-0.07179401814937592,0.007904154248535633,-0.033034853637218475,-0.039946116507053375,-0.003957211971282959,0.01979912631213665,0.03920338302850723,0.023198742419481277,-0.0038021851796656847,-0.1061340942978859,-0.07258614897727966,0.006566578987985849,-0.023369908332824707,0.028178321197628975,-0.04091496765613556,-0.026457346975803375,-0.004802894312888384,-0.031243516132235527,0.060932084918022156,0.05338958278298378,-0.026067456230521202,-0.012777717784047127,0.030694443732500076,-0.016132811084389687,0.01813514158129692,0.03305035084486008,0.028796397149562836,0.030067002400755882,-0.009653291665017605,0.027696458622813225,-0.00048651263932697475,-0.024008505046367645,0.06732442229986191,0.003604266094043851,0.05211661383509636,0.07432989031076431,0.06776303052902222,-0.042032644152641296,0.017074251547455788,0.04724365472793579,-0.0001555848866701126,0.07640675455331802,-0.035054367035627365,0.018668806180357933,0.06374965608119965,0.02925926260650158,-0.009556514210999012,0.01306259073317051,-0.036391548812389374,0.0034884687047451735,-0.04396089166402817,-0.04594021290540695,0.0015791639452800155,-0.0009103559423238039,-0.004307177383452654,0.054940417408943176,-0.05315075442194939,0.08367597311735153,0.013729246333241463,-0.014523754827678204,-0.03446390479803085,0.0708846002817154,0.06853380054235458,-0.02094944939017296,-0.05946700647473335,-0.034937113523483276,0.06845465302467346,-0.029673170298337936,-0.01334450114518404,0.02043166197836399,-0.04104345664381981,-0.04891224578022957,-0.014508051797747612,-0.050760507583618164,-0.0650150254368782,0.02871553786098957,-0.05221206694841385,0.08540071547031403,-0.04675278440117836,0.0615418404340744,-0.007765837479382753,0.028126856312155724,-0.005272290203720331,-0.04655084013938904,0.018804281949996948,0.09398901462554932,-0.004436886869370937,-0.04573441669344902,0.04369768500328064,-0.020836740732192993,0.02849348448216915,0.026714829728007317,0.04603387787938118,0.03234457969665527,-0.0032120163086801767,-0.00720333494246006,0.0237945057451725,0.05746796354651451,0.10711528360843658,-4.9354974180459976e-05,-0.07700524479150772,0.015967318788170815,0.008455992676317692,0.05179617926478386,0.04583102464675903,-0.034291986376047134,0.04876098781824112,0.05868189036846161,-0.00832947064191103,-0.008326420560479164,0.0743461474776268,-0.03880206495523453,0.11520592123270035,0.03293044492602348,0.05199345201253891,-0.029571393504738808,-0.051465485244989395,-0.014250459149479866,-0.015538709238171577,-0.05069621652364731,-0.004608175717294216,0.022481797263026237,-0.01984306238591671,-0.006090227980166674,-0.034103717654943466,0.04855476692318916,-0.02658236399292946,-0.0038069162983447313,-0.04663381725549698,-0.011598984710872173,-0.01801265776157379,0.04226880893111229,0.0677245631814003,-0.017390063032507896,-0.0019030668772757053,0.10509350150823593,-0.021937070414423943,-0.009788216091692448,0.03744519129395485,0.001477006240747869,-0.017919331789016724,-0.026724155992269516,0.039193011820316315,-0.024180298671126366,-0.06452587991952896,-0.0035215539392083883,0.005589423701167107,0.05701106786727905,0.049469269812107086,0.00446392223238945,-0.0261820238083601,-0.01770012080669403,0.058718081563711166,-0.0048060487024486065,-0.006102082319557667,0.005824038293212652,0.052870433777570724,0.010728297755122185,0.035200126469135284,-0.07551974803209305,-0.051468122750520706,-0.007006988860666752,0.004533618222922087,-0.028034832328557968,-0.006710531190037727,-0.028786277398467064,0.05626194551587105,0.010266546159982681,0.031754035502672195,0.03694238141179085,0.025284932926297188,-0.03397466614842415,-0.011790627613663673,0.04270120710134506,0.0014999251579865813,-0.024694599211215973,-0.025497794151306152,0.06993667036294937,0.03870249539613724,0.00010579009540379047,0.03504638001322746,0.025572344660758972,-0.004730174783617258,0.04920973256230354,-0.07639332860708237,-0.02675638534128666,-0.10298098623752594,0.0317358560860157,-0.030814286321401596,-0.04177673161029816,-0.0030244854278862476,0.001605249010026455,-0.04100067913532257,0.011919977143406868,0.013938448391854763,-0.014999412931501865,-0.018616221845149994,0.0035378083121031523,-0.0003974463033955544,0.03007170371711254,0.033498045057058334,-0.014186926186084747,-0.06717824190855026,-0.05676146596670151,-0.002536509186029434,-0.012216432020068169,-0.03274456039071083,-0.07016073912382126,0.027928130701184273,-0.006023918278515339,-0.014584201388061047,0.02828258089721203,0.01605045609176159,0.013603655621409416,-0.03165196627378464,0.02176736667752266,0.015294879674911499,0.03909007087349892,-0.03221159055829048,-0.02703136019408703,-0.015480686910450459,-0.06110067665576935,-0.0406932570040226,0.010359122417867184,-0.032171376049518585,-0.05194094404578209,0.02039448358118534,0.03945254161953926,0.0063950917683541775,-0.07735629379749298,0.013408828526735306,-0.04082614183425903,0.004937610123306513,-0.013532319106161594,0.02098943293094635,0.046174533665180206,0.05722230672836304,0.03884165361523628,-0.028747886419296265,-0.027442697435617447,0.020229991525411606,0.04540535807609558,-0.028022319078445435,0.04311973229050636,0.012806595303118229,0.04239150136709213,0.02941069006919861,-0.006360919214785099,0.015810860320925713,0.04363419860601425,-0.02934212051331997,0.010519737377762794,0.03338617458939552,-0.03972255066037178,0.03444833308458328,0.11895830929279327,-0.001475695869885385,0.09342533349990845,0.01004725694656372,0.00521648395806551,0.009570184163749218,0.037115368992090225,-0.014191587455570698,-0.029996715486049652,0.05852231755852699,0.003670441685244441,-0.03111087717115879,-0.007686744909733534,-0.00878136232495308,0.015304391272366047,-0.00835162028670311,0.014680332504212856,-0.06027398630976677,-0.044898197054862976,-0.031150352209806442,0.04225290194153786,-0.05963743478059769,-0.049651917070150375,0.0015495388070121408,0.0460234172642231,-0.09391812980175018,0.0565929152071476,-0.014345213770866394,-0.006359162740409374,0.014748381450772285,0.05111211538314819,0.004764666315168142,-0.0023365227971225977,-0.03408080339431763,-0.034600626677274704,-0.05124702304601669,-0.017525795847177505,-0.004975790157914162,-0.023927779868245125,-0.06676872819662094,-0.028322583064436913,0.013797154650092125,-0.01374990213662386,-0.035827670246362686,-0.026726802811026573,-0.0036591251846402884,-0.04383886978030205,-0.039545685052871704,-0.021230548620224,-0.0028415226843208075,-0.011543807573616505,-0.015424227342009544,-0.011197403073310852,0.04764008894562721,0.0227516982704401,-0.004535396117717028,0.0235084667801857,0.006367370020598173,-0.012380723841488361,0.06436987966299057,-0.006682234350591898,-0.010924633592367172,0.002654495881870389,0.04149286076426506,0.00090180360712111,0.06268277764320374,-0.04713604599237442,-0.01889045350253582,0.009414365515112877,0.04259977862238884,0.03397039324045181,0.02849150076508522,-0.027262089774012566,-0.0218980610370636,0.003047373378649354,0.023192720487713814,-0.016982559114694595,-0.020740406587719917,0.008828762918710709,-0.0809066966176033,-0.084974005818367,-0.022028392180800438,-0.002516932552680373,-0.042008355259895325,-0.021305542439222336,-0.04327180236577988,0.005354521330446005,0.0004744822799693793,-0.05184450000524521,-0.08057332038879395,-0.025101900100708008,0.018032586202025414,0.008285390213131905,-0.01446616742759943,-0.04705870524048805,0.09355530887842178,-0.003802527906373143,-0.02881501242518425,-0.0532359704375267,-0.02586442232131958,0.05022677779197693,-0.0856320932507515,-0.020890625193715096,0.028313474729657173,-0.05678088963031769,-0.016457537189126015,0.010364792309701443,0.0092974454164505,0.011288127861917019,0.008098184131085873,0.01191268302500248,-0.026048999279737473,-0.0014207729836925864,-0.025333549827337265,-0.01913163997232914,-0.021867400035262108,-0.03406573086977005,0.04469537362456322,-0.028908485546708107,0.010367023758590221,0.031788188964128494,0.014998797327280045,0.042763058096170425,-0.03708334639668465,0.01386157888919115,-0.08847704529762268,-0.07643768936395645,-0.002707232255488634,0.03341418504714966,-0.028400259092450142,-0.0018234935123473406,0.008151134476065636,0.049622874706983566,0.018184566870331764,-0.007438826374709606,-0.06772993505001068,0.0016479900805279613,0.016313103958964348,-0.01838725246489048,-0.002810437697917223,-0.008299310691654682,0.0021590180695056915,-0.07658176869153976,-0.052365776151418686,-0.0046624112874269485,-0.019321279600262642,-0.05388221889734268,-0.006629584822803736,0.10689274966716766,0.05511053279042244,-0.022639039903879166,0.035971011966466904,0.08532299101352692,-0.029289506375789642,-0.03065607137978077,0.028966661542654037,-0.02942272648215294,-0.03957923874258995,0.03224382549524307,0.018050478771328926,0.032970454543828964,0.05030900239944458,-0.041838958859443665,0.024229686707258224,0.057461854070425034,-0.02769048698246479,-0.08287783712148666,-0.013447163626551628,-0.02488953061401844,-0.0425758957862854,-0.012551549822092056,-0.04819975793361664,0.008644652552902699,0.022290777415037155,-0.011933647096157074,0.009241824969649315,0.062280043959617615,0.04031931608915329,0.001838891999796033,0.014451868832111359,0.05482003092765808,-0.008294470608234406,-0.023700769990682602,0.03858468681573868,-0.023084208369255066,0.034360695630311966,0.028466977179050446,0.007224178873002529,0.029080070555210114,-0.03933968394994736,0.03940221667289734,0.03889504820108414,0.04564525932073593,0.0435170978307724,-0.04239299148321152,-0.01216538343578577,-0.004489195998758078]) AS d
LIMIT 5;

# ユースケース2: 提供された埋め込みベクトルに関連するセクションを検索する

以下のように、このシナリオのSQLステートメントでは、埋め込みベクトルを使用して以下の条件を満たすセクションを検索します。

  • これらのセクションはクエリの埋め込みベクトルに関連している必要があります
  • user が "default" または "test" のいずれかであること
  • クエリ結果は、分解されたドキュメントのキーワードに基づいて再順序付けされ、ベクトルとキーワードのパワーを組み合わせてより正確な検索が行われます。
SELECT content, section, page, temp_t.title AS title, temp_t.pdf AS pdf 
FROM (
    SELECT content, section, page, title, pdf
    FROM chatPDF 
    JOIN chatPDF_meta ON chatPDF_meta.pdf = chatPDF.pdf 
    WHERE pdf IN (
        SELECT pdf FROM chatPDF_user
        JOIN chatPDF_meta ON chatPDF_user.pdf = chatPDF_meta.pdf
        WHERE user IN ('test', 'default')
        )
    GROUP BY content, section, page, d, title, pdf
    ORDER BY distance(chatPDF.vector, [-0.028602387756109238,0.05193805322051048,-0.031317807734012604,-0.028051884844899178,0.09304523468017578,-0.012233047746121883,0.013670934364199638,-0.058627404272556305,0.016324086114764214,0.024056825786828995,-0.007964950986206532,-0.015832604840397835,0.0465020053088665,0.05120447650551796,-0.011173561215400696,-0.008055370301008224,-0.06142625957727432,0.03517622500658035,-0.058551665395498276,-0.01308200042694807,-0.007963517680764198,0.03235742449760437,0.025468328967690468,-0.026939349249005318,0.06110360845923424,0.01686902903020382,-0.04319953918457031,0.09188768267631531,-0.022290648892521858,0.055751387029886246,-0.013135429471731186,0.008576076477766037,0.059156663715839386,0.041759055107831955,0.02573470026254654,-0.056330692023038864,-0.018969712778925896,-0.0617007240653038,-0.023090023547410965,-0.016790034249424934,0.01975618675351143,-0.020167330279946327,-0.02554931864142418,-0.0015003907028585672,0.007852326147258282,0.0541803352534771,-0.007227180525660515,-0.03331175819039345,0.0055015478283166885,0.02638649381697178,0.01572638750076294,0.0554802380502224,0.0860990360379219,-0.009892038069665432,0.0006325682043097913,-0.00466573890298605,-0.07239983975887299,-0.054198142141103745,-0.0315401628613472,-0.01702083647251129,-0.028342323377728462,0.0047926888801157475,0.011463815346360207,-0.024098187685012817,0.027457818388938904,0.02334539219737053,-0.009983752854168415,-0.0633431002497673,-0.06647104769945145,-0.06046867370605469,-0.07855461537837982,0.03323746845126152,-0.12926268577575684,0.014088059775531292,-0.031514447182416916,-0.025199050083756447,0.005245922598987818,-0.029317660257220268,0.04884430393576622,0.012645414099097252,-0.01832539215683937,-0.0046740262769162655,0.055797673761844635,0.0046990374103188515,0.027943402528762817,0.055492889136075974,-0.012013912200927734,0.018530962988734245,-0.020475702360272408,-0.03987526893615723,-0.009115397930145264,-0.0242095235735178,0.07148326933383942,0.002836561528965831,0.011758368462324142,0.010632219724357128,0.07038570195436478,-0.030432313680648804,0.002112780697643757,-0.003160921623930335,-0.06508484482765198,-0.010381905362010002,-0.023127656430006027,0.01196929533034563,0.010409880429506302,-0.08278219401836395,0.03462051600217819,-0.003167665097862482,-0.08831408619880676,-0.013079256750643253,-0.0013668457977473736,-0.034809526056051254,-0.030371399596333504,-0.031348817050457,-0.02590388059616089,-0.021449672058224678,0.03770831599831581,0.08849338442087173,0.010925468988716602,-0.009033727459609509,-0.020753415301442146,-0.01770271360874176,-0.017846258357167244,-0.0035869968123733997,-0.021489502862095833,0.045604851096868515,-0.013902788050472736,-0.0384070985019207,-0.03368276357650757,0.04123971238732338,0.0018785643624141812,0.06104699522256851,-0.03108287788927555,-0.0037539778277277946,-0.03130106255412102,-0.024152100086212158,-0.02726845070719719,0.004191273357719183,-0.05925445258617401,0.0742005705833435,-0.06772784888744354,-0.006988621316850185,-0.03191212937235832,0.02903582714498043,0.01989048160612583,0.09483784437179565,-0.04168536886572838,0.025414111092686653,0.03742283955216408,0.05055548623204231,0.0026791745331138372,0.03570197895169258,-0.0012597993481904268,0.009046895429491997,0.030189141631126404,-0.04691040888428688,-0.0009548828820697963,0.01596323400735855,0.07529480755329132,0.08409800380468369,-0.06255559623241425,-0.005675397347658873,-0.039521511644124985,0.052339985966682434,0.03254899010062218,0.015361281111836433,0.058702316135168076,0.009377162903547287,0.08463673293590546,0.011505217291414738,0.05275140702724457,0.019933659583330154,0.009439929388463497,-0.007302085869014263,-0.0043915980495512486,-0.03271979093551636,0.017273275181651115,-0.01740737073123455,0.018746431916952133,-0.00824034120887518,0.07812781631946564,-0.0384545624256134,-0.0424121618270874,0.07000716775655746,0.011353518813848495,0.04983452707529068,-0.015417062677443027,-0.004711403977125883,-0.04050220176577568,-0.0546519011259079,-0.010305394418537617,0.04194115102291107,0.008575938642024994,0.021970907226204872,0.053986597806215286,0.03271383047103882,-0.009459379129111767,-0.02329881116747856,0.0792340561747551,0.03156689554452896,-0.07657361775636673,-0.01969241164624691,0.07950250059366226,0.02582576498389244,0.03393377736210823,0.03178403154015541,-0.04218177869915962,0.01007103081792593,-0.007927320897579193,-0.027992310002446175,0.0805586576461792,-0.02275105006992817,0.0019749384373426437,0.020015651360154152,-0.02570357732474804,-0.024672288447618484,0.049746956676244736,0.019401296973228455,-0.014245196245610714,0.02062692493200302,-0.031854745000600815,-0.0027649407275021076,-0.0020269006490707397,-0.038350436836481094,0.0015580937033519149,-0.015934618189930916,0.011462862603366375,-0.02152520976960659,-0.022145027294754982,0.00734312366694212,0.03794105723500252,-0.03129032254219055,-0.010872636921703815,0.013028818182647228,0.013146125711500645,-0.001660099602304399,0.00029317382723093033,0.018063941970467567,0.009409281425178051,-0.016657380387187004,0.026886843144893646,-0.010345596820116043,0.0278768390417099,0.010111561045050621,-0.036829929798841476,-0.04345128312706947,-0.08976473659276962,0.04493587464094162,0.028010740876197815,-0.03416436165571213,0.023581236600875854,0.02721460536122322,0.02870713174343109,0.0486646443605423,0.03549840673804283,-0.055571019649505615,-0.020705128088593483,0.019123468548059464,-0.024551449343562126,0.0696345865726471,0.02460857294499874,-0.057880476117134094,-0.022359278053045273,-0.034049905836582184,-0.06536795198917389,-0.03331594914197922,-0.003080692607909441,0.027616700157523155,-0.04753870889544487,0.016564052551984787,-0.060747019946575165,-0.033646680414676666,0.03772412985563278,0.0007252924842759967,-0.04008457437157631,-0.007883877493441105,0.03680732473731041,-0.009088178165256977,0.025094570592045784,0.0027452725917100906,0.007728055119514465,-0.006499788723886013,0.03094450943171978,0.03458185866475105,0.006452739238739014,-0.060986150056123734,-0.023363333195447922,0.0022187596186995506,0.06034814938902855,0.008435722440481186,0.0015066927298903465,-0.012030326761305332,-0.04852339252829552,0.09226252138614655,0.019318904727697372,0.019926346838474274,0.031246166676282883,-0.015020517632365227,0.005295108072459698,-0.05487625300884247,0.01943211816251278,-0.0065069072879850864,-0.005782572086900473,0.03127134591341019,0.036789581179618835,-0.0021005230955779552,-0.006595696322619915,0.09886176139116287,0.044642094522714615,0.008776228874921799,-0.06613229960203171,-0.09778912365436554,0.04855702817440033,-0.019823754206299782,0.05024749040603638,0.05649179592728615,-0.060693010687828064,0.05163635313510895,-0.010757084004580975,0.06527445465326309,0.011550993658602238,-0.0006965601351112127,0.019896207377314568,-0.018262844532728195,0.06802519410848618,-0.01982499286532402,-0.013756493106484413,-0.01080138050019741,-0.0007239197148010135,0.0019289625342935324,0.013325190171599388,-0.039237406104803085,0.07574980705976486,0.011901366524398327,0.01575469598174095,-0.016088636592030525,-0.04486628249287605,0.04137108847498894,-0.014687441289424896,-0.04507390037178993,-0.08581030368804932,-0.007651266176253557,-0.02012077160179615,0.030385226011276245,-0.027815770357847214,0.055400848388671875,-0.015418179333209991,0.00169078866019845,-0.008691568858921528,0.02365257404744625,0.013552607037127018,0.018333865329623222,0.029841255396604538,-0.018621621653437614,-0.010539400391280651,0.008913171477615833,-0.002017852384597063,-0.0025873517151921988,-0.05953807011246681,0.00825919397175312,0.034610796719789505,-0.016277987509965897,-0.006498720962554216,-0.024549007415771484,-0.011766843497753143,0.018465928733348846,0.0034142436925321817,0.008967841044068336,-0.008319002576172352,-0.03201327845454216,-0.026722850278019905,-0.01684730313718319,-0.009930165484547615,-0.00018909515347331762,-0.010824637487530708,0.00852691289037466,0.026722438633441925,0.039784424006938934,0.011125900782644749,-0.006381566170603037,-0.014115390367805958,0.00244533852674067,0.05720775946974754,-0.028283527120947838,0.0358721986413002,0.047634437680244446,0.08459443598985672,0.02213694341480732,-0.017319083213806152,-0.05709897726774216,0.07799524068832397,-0.018437424674630165,0.036587849259376526,-0.01116589643061161,-0.03918002173304558,0.012297233566641808,-0.05485652759671211,0.0007847997476346791,-0.03045291267335415,0.026051536202430725,-0.004062032327055931,-0.005013444926589727,-0.04919704422354698,-0.05441347137093544,-0.02954081818461418,-0.029185811057686806,-0.03374828025698662,-0.02701537497341633,-0.02051454409956932,-0.01822960563004017,0.023751920089125633,-0.005654079373925924,-0.06160213053226471,-0.046936020255088806,0.021633923053741455,-0.02495487593114376,-0.0015039177378639579,-0.0051044500432908535,-0.022187741473317146,-0.034528106451034546,0.03416862338781357,0.032588254660367966,0.01145924348384142,-0.04730336368083954,-0.0029300148598849773,0.03436218574643135,-0.0261283740401268,-0.03162160515785217,-0.05312611535191536,0.005359994247555733,0.029726408421993256,0.03818337619304657,0.018482066690921783,0.00817812792956829,0.006548972800374031,-0.0578656941652298,0.014239479787647724,0.029399987310171127,0.002749588806182146,-0.02178371138870716,-0.07573916018009186,-0.021304957568645477,0.04327695444226265,-0.008249565027654171,0.016941362991929054,0.02107137255370617,0.024265944957733154,-0.006902314256876707,0.039750199764966965,0.014925206080079079,-0.08287150412797928,-0.04848889261484146,0.01919567957520485,0.02647663839161396,0.03438766300678253,-0.05986958369612694,-0.03709588199853897,-0.11452722549438477,-0.036673933267593384,-0.008246892131865025,-0.07221311330795288,-0.0735122561454773,-0.02042395807802677,0.018186723813414574,0.002861814806237817,-0.005009822081774473,-0.07307695597410202,-0.03611520677804947,0.026176802814006805,0.009307770989835262,0.0312464889138937,-0.024423643946647644,-0.004277748521417379,0.006169045809656382,0.07744313776493073,-0.030859921127557755,0.0008700981270521879,0.00243563624098897,-0.07085950672626495,-0.05762483924627304,-0.011360583826899529,-0.030241122469305992,0.015516492538154125,0.020868854597210884,0.02939927577972412,-0.030372491106390953,-0.06916476786136627,0.010707120411098003,-0.02807154878973961,-0.008762502111494541,-0.0034118774347007275,0.027162129059433937,-0.0616542249917984,0.03941693902015686,0.013875571079552174,0.049216143786907196,0.041300106793642044,-0.08846136182546616,0.056695155799388885,-0.03506137430667877,-0.021880896762013435,-0.018869969993829727,0.007139734458178282,0.023009536787867546,-0.010022856295108795,-0.025198865681886673,0.015322377905249596,0.04815468192100525,0.02825315110385418,-0.0006994997384026647,-0.03493428975343704,0.006141179241240025]) AS d
    LIMIT 20
) temp_t
JOIN chatPDF_meta ON temp_t.pdf = chatPDF_meta.pdf
ORDER BY log(1 + countMatches(arrayStringConcat([abstract, content, title], ' '), '(?i)(solid-state|electrolyte)')+1) AS d DESC
LIMIT 5;

# ノート

以下のポイントは、SQLとベクトルの結合クエリにMyScaleを使用する場合に適用されます。

  1. ベクトルインデックスを持つテーブルがサブクエリに関与している場合、MyScaleはベクトルインデックスを使用してクエリを高速化するため、distance()...ORDER BY...LIMIT クエリパターンを使用することができます。ClickHouseでは、ORDER BY 句は外側のクエリだけでなく、サブクエリ内でも使用することができます。

  2. MyScaleはカラム指向のデータベースなので、クエリの結果には必要なカラムのみが含まれるようにすることが理想的です。必要のないカラムはSELECT句に表示されず、SELECT *はクエリの速度を向上させるために避けるべきです。

Last Updated: Sat Apr 13 2024 10:45:55 GMT+0000