Warning: Original Text in French
This text was initially written in French. Readers who can read French are encouraged to refer to the original version for a more accurate understanding. The content below is an English translation of the original text here: https://dev.to/pykpyky/le-livre-le-plus-rare-est-manquant-saison-1-episode-1-104p.
In the apparent tranquility of Digitown, a shadow loomed over the National Library as it prepared to unveil its cultural treasures. The digitization of 325,000 rare books was about to pay tribute to the explorers of this peaceful city. However, as excitement built for the reopening of the exhibition, a twist of fate was about to darken the library's prestigious shelves. The book "De Revolutionibus Magnis Data" by Gustav Kustov, one of the rarest in the world, had mysteriously disappeared.
Faced with this urgent situation, Mayor Gaia Budskott called on the renowned Kusto Detective Agency, known for solving the most challenging mysteries. A young detective, eager to prove himself, was entrusted with his first mission: to find the missing book.
The digital data of the National Library of Digitown formed a complex maze for our young detective. While each book was labeled with a unique RFID identifier, Kustov's book had vanished, leaving its identifier on the cold library floor.
Immersing himself in the intricacies of the information system, the detective began his investigation.
.execute database script <|
// Create table for the books
.create-merge table Books(rf_id:string, book_title:string, publish_date:long, author:string, language:string, number_of_pages:long, weight_gram:long)
// Clear any previously ingested data, if it exists
.clear table Books data
// Import data for books
// (Used data is utilizing catalogue from https://github.com/internetarchive/openlibrary )
.ingest into table Books ('https://kustodetectiveagency.blob.core.windows.net/digitown-books/books.csv.gz') with (ignoreFirstRecord=true)
// Create table for the shelves
.create-merge table Shelves (shelf:long, rf_ids:dynamic, total_weight:long)
.clear table Shelves data
// Import data for shelves
.ingest into table Shelves ('https://kustodetectiveagency.blob.core.windows.net/digitown-books/shelves.csv.gz') with (ignoreFirstRecord=true)
The digital shelves constantly reported data on the books they held. The challenge was monumental: finding a needle in a digital data library. He began with a meticulous analysis of a single shelf.
Shelves
| where shelf == 1395
In raw form, however, the data was hard to work with: each shelf row held a whole array of identifiers. He first isolated each RFID identifier.
Shelves
| where shelf == 1395
| mv-expand rf_ids to typeof(string)
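To see what `mv-expand` does at this step, here is a minimal sketch run against a tiny inline table. The name `ToyShelves` and all of its values are invented for illustration; only the three column names and types come from the `Shelves` table created earlier.

```kusto
// Hypothetical two-shelf sample with the same columns as the real Shelves table.
let ToyShelves = datatable(shelf:long, rf_ids:dynamic, total_weight:long)
[
    1, dynamic(["a1", "a2"]), 1210,
    2, dynamic(["b1"]), 310
];
ToyShelves
// One output row per array element; shelf and total_weight are repeated on each row.
| mv-expand rf_ids to typeof(string)
```

On this sample the two input rows become three: two for shelf 1 (one per identifier) and one for shelf 2. The `to typeof(string)` part matters because `rf_ids` is `dynamic`, while `Books.rf_id` is a `string`, and the join that follows needs matching types.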
It was now possible to connect this to the specific information of each book. The weights of the books, compared to the weights recorded by the shelves, seemed consistent despite slight variations. A minor detail in the equation or a missing piece of the puzzle?
Shelves
| mv-expand rf_ids to typeof(string)
| join kind=inner (Books) on $left.rf_ids == $right.rf_id
| summarize TotalWeightBook = sum(weight_gram) by shelf
| join kind=inner (Shelves) on shelf
| project shelf, TotalWeightBook, total_weight
An idea suddenly struck the young detective. What if he precisely calculated the percentage difference between the total weight of the books per shelf and the weight recorded by the shelf itself?
Shelves
| mv-expand rf_ids to typeof(string)
| join kind=inner (Books) on $left.rf_ids == $right.rf_id
| summarize TotalWeightBook = sum(weight_gram) by shelf
| join kind=inner (Shelves) on shelf
| project shelf, TotalWeightBook, total_weight
// A positive variation means the shelf weighs more than its registered books
| extend WeightVariation = round((todouble(total_weight) - TotalWeightBook) / TotalWeightBook * 100, 3)
| top 10 by WeightVariation
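The reasoning behind this query can be checked on a toy example. The sketch below uses made-up tables (`ToyBooks`, `ToyShelves`) in which a book with the hypothetical identifier `zz` sits on shelf 2 without being registered there. As a variation on the query above, it groups by `total_weight` as well as `shelf`, which keeps the measured weight without a second join; that is one possible simplification, not the approach used in the original investigation.

```kusto
// Hypothetical catalogue: "zz" stands in for the unregistered book.
let ToyBooks = datatable(rf_id:string, weight_gram:long)
[
    "a1", 500,
    "a2", 700,
    "b1", 300,
    "zz", 900
];
// Shelf 2 measures about 900 g more than its only registered book.
let ToyShelves = datatable(shelf:long, rf_ids:dynamic, total_weight:long)
[
    1, dynamic(["a1", "a2"]), 1210,
    2, dynamic(["b1"]), 1205
];
ToyShelves
| mv-expand rf_ids to typeof(string)
| join kind=inner (ToyBooks) on $left.rf_ids == $right.rf_id
// total_weight is constant per shelf, so adding it to the group key changes nothing.
| summarize TotalWeightBook = sum(weight_gram) by shelf, total_weight
| extend WeightVariation = round((todouble(total_weight) - TotalWeightBook) / TotalWeightBook * 100, 3)
| order by WeightVariation desc
```

Here shelf 1 measures 1210 g against 1200 g of registered books, a variation under 1%. Shelf 2 measures 1205 g against 300 g, a variation of roughly 300%. A shelf carrying an unregistered book rises to the top in the same way when the query runs on the real data.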
One shelf stood out distinctly: shelf number 4242. A thrill of excitement lit up the detective's face. He anticipated an imminent discovery, convinced that he had found the place where the precious "De Revolutionibus Magnis Data" was hiding.
The tension rose as he headed towards the target shelf. And there, among ordinary volumes, rested the long-sought book. The mystery was solved; the detective had triumphed.
The library director expressed genuine relief, thanking the young detective for his efficiency.
The city of Digitown celebrated this memorable day, and the novice detective, who had begun this investigation with a hint of apprehension, emerged as a true detective, proving that perseverance, intelligence, and data analysis were the keys to solving puzzles.
Thus concludes our story, revealing that even the most elusive mysteries can be solved with tenacity. Does the future hold more challenges for our detective? Only time will tell.