LLM June 10, 2025

LLM-Powered Localization Pipeline #4: Hygiene for Translated Records

Word count 61k Reading time 55 mins.

This article describes the final “hygiene” step in an LLM-powered localization pipeline, focusing on programmatically cleaning and validating translated records. It covers removing unnecessary characters, normalizing Unicode, converting punctuation, detecting mismatches and unwanted Chinese characters, adjusting variable spacing, and verifying that all placeholders are present, so that translation quality is checked before saving or patching to a TMS.
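
As a rough illustration of three of the checks listed above (Unicode normalization, placeholder presence, and unwanted Chinese characters), here is a minimal sketch. It is not the article's implementation: the function name `check_record`, the `{name}` placeholder syntax, and the single CJK range used are all assumptions made for this example.

```python
import re
import unicodedata

# Assumption: placeholders look like {name}; a real pipeline may use another syntax.
PLACEHOLDER_RE = re.compile(r"\{[A-Za-z0-9_]+\}")
# Assumption: "unwanted Chinese characters" means the CJK Unified Ideographs block.
CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def check_record(source: str, translation: str) -> dict:
    """Hypothetical helper: clean one translated record and report problems."""
    # Normalize to NFC and trim surrounding whitespace.
    cleaned = unicodedata.normalize("NFC", translation).strip()
    # Placeholders present in the source but absent from the translation.
    missing = sorted(
        set(PLACEHOLDER_RE.findall(source)) - set(PLACEHOLDER_RE.findall(cleaned))
    )
    return {
        "text": cleaned,
        "missing_placeholders": missing,
        "has_cjk": bool(CJK_RE.search(cleaned)),
    }
```

A record would typically be saved or patched only when `missing_placeholders` is empty and `has_cjk` is false for a non-Chinese target language.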

Read article

PYTHON May 31, 2023

Deploying a Python Flask Application to Azure Web App Service with a Subdirectory Configuration

Word count 18k Reading time 17 mins.

In this blog post, we will explore the process of deploying a Python Flask application to Azure Web App Service (Linux). However, instead of deploying it to the default wwwroot directory, we will deploy it to a subdirectory within the web app. This approach offers flexibility and organization, allowing you to keep your application code separate from other files in the root directory.

Read article

CRAWL DATA May 12, 2023

Cracking the Code: Mastering Redfin's Web Scraping Protection

Word count 12k Reading time 11 mins.

In this post, we look at the web scraping protection on Redfin, a well-known platform for real estate listings. The main technique is to send plain HTTP requests from pure Python, with no need for a web driver such as Selenium. This lets the scraper run in many environments, including Colab and others without browser support.

Read article

ALGORITHM July 14, 2022

Apply Sliding Window technique with Two Pointers

Word count 4.3k Reading time 4 mins.

Motivated by my successful solution to the “Longest Substring Without Repeating Characters” problem on Leetcode, I eagerly present the “Sliding Window” technique, a simple yet intriguing method that leverages two pointers to achieve optimal performance.
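
For readers who want a preview, the two-pointer sliding window for “Longest Substring Without Repeating Characters” can be sketched as follows. This is one common formulation, not necessarily the one used in the article; the function name `longest_unique_substring` is chosen here for illustration.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge past it.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

Each character is visited once by the right pointer and the left pointer only moves forward, so the scan is linear in the length of the string.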

Read article

CRAWL DATA June 27, 2022

Crawl data from App Store Connect

Word count 6.8k Reading time 6 mins.

In this article, I will show you how to retrieve Units, the term for the number of downloads of your own app published on the App Store, broken down by day, month, and year, using a Python script.

Read article

ALGORITHM September 03, 2021

Find the Median of Two Sorted Arrays

Word count 16k Reading time 14 mins.

Given two sorted arrays nums1 and nums2 of sizes m and n, find the median of the two arrays.

This problem is rated Hard on leetcode.com. The challenge is that the algorithm must run in O(log(m+n)) time.
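
One well-known way to meet the logarithmic bound is to binary-search a partition point in the shorter array, which takes O(log(min(m, n))) time and therefore stays within O(log(m+n)). The sketch below shows that approach under the assumption that both inputs are sorted in ascending order; it may differ from the solution the article develops, and the function name is chosen here for illustration.

```python
def find_median_sorted_arrays(nums1, nums2):
    """Median of two ascending-sorted lists via partition binary search."""
    # Always search the shorter list so the partition index in b stays valid.
    a, b = (nums1, nums2) if len(nums1) <= len(nums2) else (nums2, nums1)
    m, n = len(a), len(b)
    if n == 0:
        raise ValueError("both arrays are empty")
    half = (m + n + 1) // 2  # number of elements in the combined left half
    lo, hi = 0, m
    while lo <= hi:
        i = (lo + hi) // 2   # elements taken from a
        j = half - i         # elements taken from b
        a_left = a[i - 1] if i > 0 else float("-inf")
        a_right = a[i] if i < m else float("inf")
        b_left = b[j - 1] if j > 0 else float("-inf")
        b_right = b[j] if j < n else float("inf")
        if a_left <= b_right and b_left <= a_right:
            # Valid partition: everything on the left <= everything on the right.
            if (m + n) % 2:
                return float(max(a_left, b_left))
            return (max(a_left, b_left) + min(a_right, b_right)) / 2
        if a_left > b_right:
            hi = i - 1       # took too many elements from a
        else:
            lo = i + 1       # took too few elements from a
    raise ValueError("inputs must be sorted in ascending order")
```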

Read article