LLM-Powered Localization Pipeline #4: Hygiene for Translated Records
Words count 61k Reading time 55 mins.
This article describes the final “hygiene” step in an LLM-powered localization pipeline: programmatically cleaning and validating translated records. It covers removing unnecessary characters, normalizing Unicode, converting punctuation, detecting mismatches and unwanted Chinese characters, adjusting variable spacing, and checking that all placeholders are present, so that translation quality is verified before records are saved or patched to a TMS.
Deploying a Python Flask Application to Azure Web App Service with a Subdirectory Configuration
Words count 18k Reading time 17 mins.
In this blog post, we will walk through deploying a Python Flask application to Azure Web App Service (Linux). Instead of deploying it to the default wwwroot directory, we will deploy it to a subdirectory within the web app. This keeps the application code separate from other files in the root directory and makes the deployment easier to organize.
Cracking the Code: Mastering Redfin's Web Scraping Protection
Words count 12k Reading time 11 mins.
In this post, we will look at the web scraping protection on Redfin, a well-known platform for real estate listings. The main technique uses plain HTTP requests from pure Python, with no need for a web driver such as Selenium. As a result, it can run in many environments, including Colab and other environments without browser support.
Apply Sliding Window technique with Two Pointers
Words count 4.3k Reading time 4 mins.
Motivated by my solution to the “Longest Substring Without Repeating Characters” problem on LeetCode, I present the “Sliding Window” technique, a simple method that uses two pointers to process a sequence in a single linear pass.
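As a minimal sketch of the idea (not necessarily the code used in the post itself), the problem can be solved with two pointers that bound a window containing no repeated characters. The function name longest_unique_substring and the last_seen dictionary are illustrative choices, not names taken from the article.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s without repeating characters."""
    last_seen = {}   # character -> index of its most recent occurrence
    left = 0         # left edge of the current window (inclusive)
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge
        # just past that earlier occurrence.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

Each character is visited once by the right pointer and the left pointer only moves forward, so the scan takes time linear in the length of the string.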
Crawl data from App Store Connect
Words count 6.8k Reading time 6 mins.
In this article, I will show you how to retrieve Units, the figure that indicates how many times your own application published on the App Store has been downloaded, broken down by day, month, and year, using a Python script.
Finding the Median of Two Sorted Arrays
Words count 16k Reading time 14 mins.
Given two sorted arrays nums1 and nums2 of sizes m and n, find the median of the two arrays.
This problem is rated Hard on leetcode.com. The challenge is that the algorithm must run in O(log(m+n)) time.
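One common way to meet the logarithmic bound is to binary search for a partition point in the shorter array. The sketch below assumes that approach, which may differ from the one taken in the post; the function name find_median_sorted_arrays is illustrative. It runs in O(log(min(m, n))) time, which is within the required O(log(m+n)).

```python
from math import inf


def find_median_sorted_arrays(nums1, nums2):
    """Median of two sorted lists via binary search on the shorter one."""
    a, b = (nums1, nums2) if len(nums1) <= len(nums2) else (nums2, nums1)
    m, n = len(a), len(b)
    if n == 0:
        raise ValueError("both inputs are empty")
    half = (m + n + 1) // 2      # size of the combined left half
    lo, hi = 0, m
    while lo <= hi:
        i = (lo + hi) // 2       # elements taken from a for the left half
        j = half - i             # elements taken from b for the left half
        a_left = a[i - 1] if i > 0 else -inf
        a_right = a[i] if i < m else inf
        b_left = b[j - 1] if j > 0 else -inf
        b_right = b[j] if j < n else inf
        if a_left <= b_right and b_left <= a_right:
            # Valid partition: everything on the left <= everything on the right.
            if (m + n) % 2:
                return float(max(a_left, b_left))
            return (max(a_left, b_left) + min(a_right, b_right)) / 2
        if a_left > b_right:
            hi = i - 1           # took too many elements from a
        else:
            lo = i + 1           # took too few elements from a
    raise ValueError("inputs must be sorted")
```

For example, find_median_sorted_arrays([1, 3], [2]) returns 2.0 and find_median_sorted_arrays([1, 2], [3, 4]) returns 2.5.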