In this tutorial, we will look at two ways to extract integers from a text file in Python: a regular expression from the standard library’s ‘re’ module, and a manual character scan with the built-in ‘isdigit()’ string method. Both methods read the file into a string and return the numbers as a list of integers. By the end of this guide, you’ll understand how each approach works, which gives you a foundation for more advanced text manipulation and data extraction tasks.
Text file:
Text.txt
What is the sum of 130,125,191?
If we minus 712 from 1500, how much do we get?
50 times of 8 is equal to:
110 divided by 10 is:
5: 20 +( 90 ÷ 2) is equal to:
Method 1: Using a Regular Expression
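import re

# Read the whole file into a single string
with open('Text.txt', 'r') as file:
    all_Text = file.read()

# Find every run of one or more digits, then convert each run to an int
numbers = re.findall(r'\d+', all_Text)
num = [int(number) for number in numbers]
print("Numbers :" + str(num))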
Break-down of the Code:
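- The code uses Python’s ‘re’ module to find all sequences of digits in the text. The regex pattern r'\d+' matches one or more consecutive digits, so ‘re.findall()’ returns every run of digits as a string. Signs, decimal points, and thousands separators are not part of the pattern, which is why 130,125,191 is returned as three separate numbers.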
- After extracting the numeric values as strings, the code converts them into integers using a list comprehension. This step ensures that the numbers can be used for numerical operations, such as arithmetic calculations or comparisons.
Output:
Numbers :[130, 125, 191, 712, 1500, 50, 8, 110, 10, 5, 20, 90, 2]
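To try the pattern without a file, the same steps can be run on a plain string. The following is a minimal sketch: ‘extract_integers’ is a hypothetical helper name chosen for illustration and is not part of the code above.

```python
import re

# Hypothetical helper name; it wraps the same two steps as Method 1.
def extract_integers(text):
    """Return every run of digits in `text` as an int."""
    return [int(match) for match in re.findall(r'\d+', text)]

sample = "What is the sum of 130,125,191? If we minus 712 from 1500"
print(extract_integers(sample))          # [130, 125, 191, 712, 1500]

# Signs and decimal points are not matched by \d+,
# so "-7" yields 7 and "3.14" yields 3 and 14.
print(extract_integers("-7 and 3.14"))   # [7, 3, 14]
```

The second call shows the limit of the pattern: it extracts runs of digits, not signed or decimal numbers.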
Method 2: Using the ‘isdigit()’ Function
# Read the whole file into a single string
with open('Text.txt', 'r') as file:
    all_Text = file.read()

numbers = []
i = 0
while i < len(all_Text):
    if all_Text[i].isdigit():
        # Start of a digit run: advance i to the first non-digit character
        start = i
        while i < len(all_Text) and all_Text[i].isdigit():
            i += 1
        numbers.append(int(all_Text[start:i]))
    else:
        i += 1

print("Numbers :",numbers)
Break-down of the Code:
- The code iterates through the string ‘all_Text’ one character at a time and extracts contiguous sequences of digits. Each sequence is sliced out of the string, converted into an integer, and appended to the ‘numbers’ list.
- The ‘isdigit()’ function checks whether each character in ‘all_Text’ is a digit. When a digit is found, the inner loop advances the index to the end of that run, so the whole run is converted as one number rather than digit by digit.
Output:
Numbers : [130, 125, 191, 712, 1500, 50, 8, 110, 10, 5, 20, 90, 2]
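The same grouping of contiguous digits can also be written more compactly. The sketch below swaps the manual index loop for ‘itertools.groupby’ from the standard library; it is an alternative technique, not the method shown above, and ‘extract_integers_groupby’ is a hypothetical name.

```python
from itertools import groupby

# Hypothetical helper name; uses groupby instead of a manual index loop.
def extract_integers_groupby(text):
    """Group consecutive characters by isdigit() and keep the digit runs."""
    return [
        int("".join(run))
        for is_digit, run in groupby(text, key=str.isdigit)
        if is_digit
    ]

print(extract_integers_groupby("50 times of 8 is equal to:"))  # [50, 8]
```

‘groupby’ starts a new group each time the result of ‘str.isdigit’ changes, so each group is either a run of digits or a run of other characters, and only the digit runs are converted.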
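Note: ‘isnumeric()’ accepts a wider range of characters than ‘isdigit()’, including Unicode fractions such as ‘½’, but ‘int()’ cannot convert those characters and raises a ValueError. The same applies to superscript digits such as ‘²’, which ‘isdigit()’ itself accepts. If the file may contain such characters, ‘isdecimal()’ is the safer check, because it accepts only the decimal digits that ‘int()’ can convert.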
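To see how these string checks differ in practice, the sketch below runs ‘isdecimal()’, ‘isdigit()’, and ‘isnumeric()’ on a few characters and records whether ‘int()’ can convert each one. ‘check’ is a hypothetical helper written for this comparison.

```python
# Hypothetical helper: report the three string checks and the int() result.
def check(ch):
    try:
        value = int(ch)
    except ValueError:
        value = None   # int() could not convert this character
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric(), value)

samples = {
    "ASCII digit 7": "7",
    "Arabic-Indic digit three": "\u0663",
    "superscript two": "\u00b2",
    "vulgar fraction one half": "\u00bd",
}

for label, ch in samples.items():
    print(label, check(ch))

# ASCII digit 7 (True, True, True, 7)
# Arabic-Indic digit three (True, True, True, 3)
# superscript two (False, True, True, None)
# vulgar fraction one half (False, False, True, None)
```

Only the characters for which ‘isdecimal()’ is True are converted by ‘int()’; the superscript and the fraction pass the looser checks but cannot be turned into an integer.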