Extract all the integers from a text file using Python

In this tutorial, we look at two ways to extract integers from a text file using Python: the standard library’s ‘re’ module and the built-in string method ‘isdigit()’. Both approaches read the file into a string, locate runs of consecutive digits, and convert each run to an integer. By the end of this guide, you’ll be able to pull the integers out of a text file and have a starting point for more advanced text manipulation and data extraction tasks.

Text file:

Text.txt

What is the sum of 130,125,191?
If we minus 712 from 1500, how much do we get?
50 times of 8 is equal to:
110 divided by 10 is:
5: 20 +( 90 ÷ 2) is equal to:

Method 1: Using Regular Expression

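import re

# Read the whole file; the sample contains '÷', so this assumes it is saved as UTF-8.
with open('Text.txt', 'r', encoding='utf-8') as file:
    all_Text = file.read()

# Find every run of digits, then convert each match from str to int.
numbers = re.findall(r'\d+', all_Text)
num = [int(number) for number in numbers]
print("Numbers :" + str(num))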

Break-down of the code:

  • The code uses Python’s ‘re’ module to find all sequences of digits in the text. The regex pattern r'\d+' matches one or more consecutive digits, so every run of digits is returned as a separate string. Signs and decimal points are not part of the match: ‘-42’ yields 42, and ‘3.14’ yields 3 and 14.
  • After extracting the digit runs as strings, the code converts them into integers using a list comprehension. This step ensures that the numbers can be used for numerical operations, such as arithmetic calculations or comparisons.
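
To make the behaviour of the pattern concrete, here is a minimal sketch on a made-up string (the ‘sample’ text and the variable names are assumptions, not part of Text.txt). It also shows one possible variant, r'-?\d+', that keeps a leading minus sign.

```python
import re

# Hypothetical sample text, chosen to include a negative number and a decimal.
sample = "Offset -42, ratio 3.14, total 1500"

# \d+ ignores the sign and splits the decimal into two digit runs.
unsigned = re.findall(r"\d+", sample)
print(unsigned)  # ['42', '3', '14', '1500']

# Assumed variant: an optional leading minus keeps the sign.
signed = [int(match) for match in re.findall(r"-?\d+", sample)]
print(signed)  # [-42, 3, 14, 1500]
```

Note that the signed variant would also read ‘10-5’ as 10 and -5, so it only suits text where a hyphen before a digit really means a negative number.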

Output:

Numbers :[130, 125, 191, 712, 1500, 50, 8, 110, 10, 5, 20, 90, 2]

Method 2: Using the ‘isdigit()’ method

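# Read the whole file; this assumes it is saved as UTF-8.
with open('Text.txt', 'r', encoding='utf-8') as file:
    all_Text = file.read()

numbers = []
i = 0
while i < len(all_Text):
    if all_Text[i].isdigit():
        # Start of a digit run: advance i to the first non-digit character.
        start = i
        while i < len(all_Text) and all_Text[i].isdigit():
            i += 1
        # The slice [start:i] holds the whole run; convert it to an int.
        numbers.append(int(all_Text[start:i]))
    else:
        i += 1
print("Numbers :", numbers)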

Break-down of the code:

  • The code walks through the string ‘all_Text’ one character at a time. When it reaches a digit, it records the starting index and keeps advancing until the run of digits ends. The slice covering that run is converted to an integer and appended to the ‘numbers’ list.
  • The ‘isdigit()’ string method checks whether a character is a digit, which lets the code find where each contiguous sequence of digits begins and ends.
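
The same grouping idea can be written more compactly with ‘itertools.groupby’ from the standard library. This is an alternative sketch rather than the loop shown above, and the function name ‘extract_integers’ is hypothetical.

```python
from itertools import groupby


def extract_integers(text):
    """Return the integers formed by each contiguous run of digits in text."""
    return [
        int("".join(run))
        # groupby splits the string into alternating digit / non-digit runs.
        for is_digit, run in groupby(text, key=str.isdigit)
        if is_digit
    ]


# One line taken from the sample file.
result = extract_integers("5: 20 +( 90 ÷ 2) is equal to:")
print(result)  # [5, 20, 90, 2]
```

Because the function takes a string, it can be tried on a single line before being applied to the contents of a whole file.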

Output:

Numbers : [130, 125, 191, 712, 1500, 50, 8, 110, 10, 5, 20, 90, 2]

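Note: ‘isdigit()’ also returns True for some characters that ‘int()’ cannot convert, such as the superscript ‘²’, and ‘isnumeric()’ accepts an even wider range, including fractions like ‘½’. Swapping in ‘isnumeric()’ can therefore make the conversion fail with a ValueError. If the text may contain such characters, ‘isdecimal()’ is the safer check, because it accepts only decimal digits, which ‘int()’ can convert.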
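
The difference between the three checks can be seen with a small sketch. The characters below are illustrative choices and the helper ‘converts’ is a hypothetical name used only for this example.

```python
def converts(ch):
    """Return True if int() accepts the character, False otherwise."""
    try:
        int(ch)
        return True
    except ValueError:
        return False


# ASCII digit, superscript two, vulgar fraction one half, Arabic-Indic digit three.
samples = ["7", "\u00b2", "\u00bd", "\u0663"]

report = {
    ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric(), converts(ch))
    for ch in samples
}

for ch, flags in report.items():
    # Columns: isdecimal, isdigit, isnumeric, int() succeeds
    print(repr(ch), flags)
```

In this sketch, only the characters for which ‘isdecimal()’ is True are also accepted by ‘int()’.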
