A modern, open-source data stack for blockchain

1. The challenge for a modern blockchain data stack

There are several challenges that a modern blockchain indexing startup may face, including:

  • Massive amounts of data. As the amount of data on the blockchain increases, the data index needs to scale up to handle the increased load and provide efficient access to the data. This leads to higher storage costs, slower metrics calculation, and increased load on the database server.
  • Complex data processing pipeline. Blockchain technology is complex, and building a comprehensive and reliable data index requires a deep understanding of the underlying data structures and algorithms. The diversity of blockchain implementations adds to this complexity. For example, NFTs on Ethereum are usually created within smart contracts following the ERC721 and ERC1155 formats. In contrast, NFTs on Polkadot are usually built directly within the blockchain runtime. Both should be treated as NFTs and stored as such.
  • Integration capabilities. To provide maximum value to users, a blockchain indexing solution may need to integrate its data index with other systems, such as analytics platforms or APIs. This is challenging and requires significant effort in the architecture design.
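To make the NFT example above concrete, here is a minimal sketch of how records from a contract-based chain and a runtime-based chain could be mapped into one shared shape. The `NFTRecord` schema, the two adapter functions, and every input field name are hypothetical and chosen only for illustration; they are not Footprint Analytics' actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NFTRecord:
    """Chain-agnostic NFT row (hypothetical schema, for illustration only)."""
    chain: str
    collection: str  # contract address on Ethereum, collection id on Polkadot
    token_id: str
    owner: str
    origin: str      # "smart_contract" or "runtime"


def from_erc721_transfer(event: dict) -> NFTRecord:
    # `event` is an assumed, already-decoded ERC721 Transfer event.
    return NFTRecord(
        chain="ethereum",
        collection=event["contract_address"].lower(),
        token_id=str(event["token_id"]),
        owner=event["to"].lower(),
        origin="smart_contract",
    )


def from_runtime_item(item: dict) -> NFTRecord:
    # `item` is an assumed record produced by a runtime-level NFT module.
    return NFTRecord(
        chain="polkadot",
        collection=str(item["collection_id"]),
        token_id=str(item["item_id"]),
        owner=item["owner"],
        origin="runtime",
    )


rows = [
    from_erc721_transfer(
        {"contract_address": "0x" + "AB" * 20, "token_id": 42, "to": "0x" + "CD" * 20}
    ),
    from_runtime_item({"collection_id": 7, "item_id": 1, "owner": "placeholder-account"}),
]
for row in rows:
    print(row)
```

Whatever the real schema looks like, the point is that downstream queries see one table of NFTs regardless of how each chain implements them.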

As blockchain technology has become more widespread, the amount of data stored on blockchains has increased. More people are using the technology, and each transaction adds new data. In addition, blockchain technology has evolved from simple money-transfer applications, such as those built on Bitcoin, to more complex applications that implement business logic within smart contracts. These smart contracts can generate large amounts of data, which adds to the complexity and size of the blockchain. Over time, this has led to larger and more complex blockchains.

In this article, we review the evolution of Footprint Analytics' technology architecture in stages, as a case study of how the Iceberg-Trino technology stack addresses the challenges of on-chain data.

Footprint Analytics has indexed data from about 22 public blockchains, 17 NFT marketplaces, 1,900 GameFi projects, and over 100,000 NFT collections into a semantic abstraction data layer. It is the most comprehensive blockchain data warehouse solution in the world.

Blockchain data, which includes over 20 billion rows of financial transaction records that data analysts query frequently, is different from the ingestion logs found in traditional data warehouses.

We have gone through 3 major upgrades in the past several months to meet growing business requirements:

2. Architecture 1.0 BigQuery

At the beginning of Footprint Analytics, we used Google BigQuery as our storage and query engine. BigQuery is a great product: it is blazingly fast, easy to use, and provides dynamic computing power and a flexible UDF syntax that helped us get the job done quickly.

However, BigQuery also has several problems.

  • Data is not compressed, resulting in high costs, especially when storing the raw data of the more than 22 blockchains covered by Footprint Analytics.
  • Insufficient concurrency: BigQuery only supports 100 simultaneous queries, which is unsuitable for the high-concurrency scenarios Footprint Analytics faces when serving many analysts and users.
  • Lock-in with Google BigQuery, which is a closed-source product.

So we decided to explore alternative architectures.

3. Architecture 2.0 OLAP

We were very interested in some of the OLAP products that had become very popular. The most attractive advantage of OLAP is its query response time: it typically returns results over massive amounts of data in under a second, and it can also support thousands of concurrent queries.

We picked one of the best OLAP databases, Doris, to give it a try. This engine performs well. However, we soon ran into some other issues:

  • Data types such as Array or JSON were not yet supported (November 2022). Arrays are a common data type in some blockchains, for instance the topics field in EVM logs. Being unable to compute on Array directly affected our ability to compute many business metrics.
  • Limited support for DBT and for merge statements. These are common requirements for data engineers in ETL/ELT scenarios where we need to update newly indexed data.
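To illustrate why array support matters for EVM logs, the sketch below decodes token transfers from the `topics` array of a log entry. The topic hash is the standard signature hash of `Transfer(address,address,uint256)`; the `sample_logs` data and the dictionary layout of each log are assumptions made up for this example.

```python
# keccak256("Transfer(address,address,uint256)"), the standard Transfer event topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_transfers(logs: list) -> list:
    transfers = []
    for log in logs:
        topics = log["topics"]
        # ERC-20 style Transfer: topic0 plus two indexed addresses = 3 topics.
        if len(topics) == 3 and topics[0] == TRANSFER_TOPIC:
            transfers.append(
                {
                    "from": topic_to_address(topics[1]),
                    "to": topic_to_address(topics[2]),
                    "value": int(log["data"], 16),
                }
            )
    return transfers


# Made-up sample logs: one Transfer and one unrelated event.
sample_logs = [
    {
        "topics": [
            TRANSFER_TOPIC,
            "0x" + "0" * 24 + "11" * 20,
            "0x" + "0" * 24 + "22" * 20,
        ],
        "data": hex(1000),
    },
    {"topics": ["0x" + "ab" * 32], "data": "0x"},
]

decoded = decode_transfers(sample_logs)
print(decoded)
```

Every step here indexes into or measures the array, which is the kind of operation an engine without an Array type cannot express directly in SQL.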

That being said, we could not use Doris for our whole data pipeline in production, so we tried to use Doris as an OLAP database to solve part of our problem in the data production pipeline, acting as a query engine and providing fast, highly concurrent query capabilities.

Unfortunately, we could not replace BigQuery with Doris, so we had to periodically synchronize data from BigQuery to Doris and use Doris as a query engine. This synchronization process had several issues. One was that update writes piled up quickly when the OLAP engine was busy serving queries to the front-end clients. As a result, writes slowed down, and synchronization took much longer and sometimes could not finish at all.
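A periodic synchronization of this kind usually comes down to a merge (upsert) of newly indexed rows into the serving database. The sketch below shows that pattern with SQLite standing in for the real engines, purely so it runs locally. The `token_balance` table, its columns, and the rule that a higher block number wins are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_balance ("
    "address TEXT PRIMARY KEY, balance INTEGER, updated_block INTEGER)"
)


def sync_batch(conn, rows):
    """Upsert a batch of newly indexed rows; newer blocks overwrite older ones."""
    conn.executemany(
        """
        INSERT INTO token_balance (address, balance, updated_block)
        VALUES (?, ?, ?)
        ON CONFLICT(address) DO UPDATE SET
            balance = excluded.balance,
            updated_block = excluded.updated_block
        WHERE excluded.updated_block > token_balance.updated_block
        """,
        rows,
    )
    conn.commit()


sync_batch(conn, [("0xaaa", 10, 100), ("0xbbb", 5, 100)])
sync_batch(conn, [("0xaaa", 25, 101), ("0xbbb", 1, 99)])  # second row is stale

result = conn.execute(
    "SELECT address, balance FROM token_balance ORDER BY address"
).fetchall()
print(result)  # [('0xaaa', 25), ('0xbbb', 5)]
```

Each batch is a write-heavy operation, which helps explain why such writes back up when the same engine is also busy answering user queries.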

We realized that OLAP could solve several of the issues we were facing but could not become the turnkey solution for Footprint Analytics, especially for the data processing pipeline. Our problem was bigger and more complex, and OLAP as a query engine alone was not enough for us.

4. Architecture 3.0 Iceberg + Trino

Welcome to Footprint Analytics architecture 3.0, a complete overhaul of the underlying architecture. Drawing on lessons from the two earlier architectures and on the experience of other successful big data projects such as Uber, Netflix, and Databricks, we redesigned the entire architecture from the ground up to separate the storage, computation, and query of data into three distinct pieces.

4.1. Introduction of the data lake

We first turned our attention to the data lake, a newer type of data storage for both structured and unstructured data. A data lake is well suited to on-chain data storage, because the formats of on-chain data range widely from unstructured raw data to the structured abstraction data that Footprint Analytics is well known for. We expected to use a data lake to solve the problem of data storage, and ideally it would also support mainstream compute engines such as Spark and Flink, so that integrating different types of processing engines would not be painful as Footprint Analytics evolves.

Iceberg integrates very well with Spark, Flink, Trino, and other compute engines, so we can choose the most appropriate engine for each of our metrics. For example:

  • For metrics requiring complex computational logic, Spark is the choice.
  • Flink for real-time computation.
  • For simple ETL tasks that can be performed using SQL, we use Trino.
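The selection rule in the list above can be written down as a small routing function. This is only a sketch: the `Task` fields and the order in which the checks are applied (real-time first, then complex logic) are assumptions, since the text does not say how overlapping cases are resolved.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    realtime: bool = False       # needs streaming or low-latency results
    complex_logic: bool = False  # needs logic that is awkward in plain SQL


def choose_engine(task: Task) -> str:
    """Pick a compute engine for a task, following the three rules above."""
    if task.realtime:
        return "flink"
    if task.complex_logic:
        return "spark"
    return "trino"  # simple ETL that can be expressed in SQL


tasks = [
    Task("daily_active_addresses"),
    Task("wallet_clustering", complex_logic=True),
    Task("live_price_alerts", realtime=True),
]
for task in tasks:
    print(task.name, "->", choose_engine(task))
```

Because all three engines read and write the same Iceberg tables, routing a task to a different engine does not require moving the data.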

4.2. Query engine

With Iceberg solving the storage and computation problems, we had to think about choosing a query engine. Few options were available for us to consider.

The most important thing we considered before going deeper was that the future query engine had to be compatible with our current architecture:

  • To support BigQuery as a data source
  • To support DBT, on which we rely to produce many of our metrics
  • To support the BI tool Metabase

Based on the above, we chose Trino. It has very good support for Iceberg, and its team was so responsive that a bug we raised was fixed the next day and released in the latest version the following week. This made it the best choice for the Footprint team, which also requires high implementation responsiveness.

4.3. Performance testing

Once we had decided on our direction, we ran a performance test on the Trino + Iceberg combination to see whether it could meet our needs. To our surprise, the queries were incredibly fast.

Given that Presto + Hive had for years been the weakest performer in comparisons amid all the OLAP hype, the Trino + Iceberg combination completely blew our minds.

Here are the results of our tests.

Case 1: join a large dataset

An 800 GB table1 joins another 50 GB table2 and performs complex business calculations.

Case 2: use a single large table to run a distinct query

Test SQL: select distinct(address) from the table group by day

The Trino + Iceberg combination is about 3 times faster than Doris in the same configuration.
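The case 2 test SQL above is written in shorthand. One plausible reading is "count the distinct addresses seen on each day", and the sketch below runs that reading on a tiny table. SQLite is used only so the example is runnable locally; it says nothing about Trino or Doris performance, and the `transactions` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (day TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("2022-11-01", "0xaaa"),
        ("2022-11-01", "0xaaa"),  # repeated address on the same day
        ("2022-11-01", "0xbbb"),
        ("2022-11-02", "0xaaa"),
    ],
)

daily = conn.execute(
    "SELECT day, COUNT(DISTINCT address) "
    "FROM transactions GROUP BY day ORDER BY day"
).fetchall()
print(daily)  # [('2022-11-01', 2), ('2022-11-02', 1)]
```

On a table with billions of rows, this kind of query forces the engine to deduplicate a high-cardinality column within every group, which is why it makes a useful stress test.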

In addition, there was another surprise. Because Iceberg can use data formats such as Parquet and ORC, which compress the data as it is stored, Iceberg's table storage takes only about 1/5 of the space used by the other data warehouses. The storage size of the same table in the three databases is as follows:

Note: The above tests are examples we encountered in actual production and are for reference only.
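As a rough intuition for the storage saving described above, the sketch below compares a row-oriented layout with a column-oriented, compressed layout of the same records. It uses only `json` and `zlib` from the Python standard library; it is not Parquet or ORC, the sample rows are invented, and the sizes it prints are not comparable to the 1/5 figure reported above.

```python
import json
import zlib

# Hypothetical transfer rows: few distinct chains and tokens, many rows.
rows = [
    {
        "chain": "ethereum",
        "token": "USDT" if i % 3 else "USDC",
        "block": 16_000_000 + i // 50,
        "amount": i % 100,
    }
    for i in range(10_000)
]

# Row-oriented layout: one JSON object per record, keys repeated every time.
row_bytes = "\n".join(json.dumps(r) for r in rows).encode()

# Column-oriented layout: all values of one column stored together.
columns = {key: [r[key] for r in rows] for key in rows[0]}
col_bytes = json.dumps(columns).encode()

# Similar values sit next to each other, so general-purpose compression works well.
col_compressed = zlib.compress(col_bytes)

print("row-oriented, uncompressed:", len(row_bytes))
print("columnar, uncompressed:    ", len(col_bytes))
print("columnar, compressed:      ", len(col_compressed))
```

Real columnar formats add per-column encodings such as dictionary and run-length encoding on top of this idea.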

4.4. Upgrade effect

The performance test results gave us enough confidence to proceed, and it took our team about 2 months to complete the migration. This is a diagram of our architecture after the upgrade.

  • Multiple compute engines match our various needs.
  • Trino supports DBT and can query Iceberg directly, so we no longer have to deal with data synchronization.
  • The strong performance of Trino + Iceberg allows us to open up all Bronze data (raw data) to our users.

5. Summary

Since its launch in August 2021, the Footprint Analytics team has completed three architectural upgrades in less than a year and a half, thanks to its strong desire and determination to bring the benefits of the best database technology to its crypto users, and to solid execution in implementing and upgrading its underlying infrastructure and architecture.

The Footprint Analytics architecture upgrade 3.0 has brought a new experience to its users, allowing users from different backgrounds to gain insights across more diverse uses and applications:

  • Built with the Metabase BI tool, Footprint helps analysts access decoded on-chain data, explore it with complete freedom in their choice of tools (no-code or hard-code), query the entire history, and cross-examine datasets to get insights in no time.
  • Integration of both on-chain and off-chain data for analysis across web2 + web3.
  • By building and querying metrics on top of Footprint's business abstraction, analysts and developers save time on 80% of repetitive data processing work and can focus on meaningful metrics, research, and product solutions based on their business.
  • A seamless experience from Footprint Web to REST API calls, all based on SQL.
  • Real-time alerts and actionable notifications on key signals to support investment decisions.

Source: https://cryptoslate.com/iceberg-spark-trino-a-modern-open-source-data-stack-for-blockchain/