
August 21, 2024

Mastering Substring Manipulations in Linux: A Comprehensive Guide

 

Linux systems provide a variety of methods to extract, replace, and manipulate strings of text, whether you are working with large data sets, cleaning files, or fetching strings for further processing. This article walks through several commands that simplify these tasks.

It is crucial to determine whether the aim is to extract an entire word or a specific sequence of characters. The selection of commands will vary accordingly.

A simple way to retrieve a word from text based on its position, such as the third word, is the awk command. A command of the form awk '{print $3}' prints the third word of each input line. Here, $3 signifies the third field because, by default, awk splits each line on runs of whitespace (spaces and tabs).
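
A minimal sketch of the awk approach described above. The sample sentence is made up for illustration; any whitespace-separated text should behave the same way.

```shell
# Hypothetical sample sentence (not from the original article)
sentence="Remember to feed the dog"

# $3 is the third whitespace-separated field of the input line
third=$(echo "$sentence" | awk '{print $3}')
echo "$third"   # prints: feed
```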

The cut command can do the same with an explicit delimiter: the -d option sets the delimiter (for example, -d' ' for a space, since cut's default delimiter is a tab), and -f3 extracts the third field.
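
The cut equivalent might look like the sketch below, again with a made-up sentence. Unlike awk, cut treats every single space as a separator, so consecutive spaces produce empty fields.

```shell
# Hypothetical sample sentence
sentence="Remember to feed the dog"

# -d' ' sets a single space as the delimiter; -f3 selects the third field
third=$(echo "$sentence" | cut -d' ' -f3)
echo "$third"   # prints: feed
```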

You can extract multiple fields at once with cut by giving -f a comma-separated list (such as -f1,3) or a range (such as -f2-4).
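
As a sketch of selecting several fields with cut, assuming the same made-up sentence: a comma-separated list picks individual fields and a range picks a run of adjacent ones. The selected fields are joined with the input delimiter.

```shell
# Hypothetical sample sentence
sentence="Remember to feed the dog"

# Fields 1 and 3
picked=$(echo "$sentence" | cut -d' ' -f1,3)
echo "$picked"   # prints: Remember feed

# Fields 2 through 4
range=$(echo "$sentence" | cut -d' ' -f2-4)
echo "$range"    # prints: to feed the
```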

To use an alternative delimiter, such as a colon, pass it to the -d option (for example, -d:).
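
A short sketch with a colon delimiter. The record below is a made-up line in the style of /etc/passwd, used only for illustration.

```shell
# Hypothetical passwd-style record
record="jdoe:x:1001:1001:Jane Doe:/home/jdoe:/bin/bash"

# Fields 1 and 7 (login name and shell), separated by colons
login_shell=$(echo "$record" | cut -d: -f1,7)
echo "$login_shell"   # prints: jdoe:/bin/bash
```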

With awk, you can specify multiple delimiters by passing a bracket expression to the -F option. For example, -F'[: ]' tells awk to split fields on either a colon or a space.
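
One way to sketch multiple delimiters in awk, assuming a made-up record that mixes colons and spaces:

```shell
# Hypothetical record containing both colons and a space
record="name:Jane Doe:admin"

# The bracket expression [: ] splits on either a colon or a space,
# giving the fields: name, Jane, Doe, admin
fields=$(echo "$record" | awk -F'[: ]' '{print $2, $4}')
echo "$fields"   # prints: Jane admin
```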

To extract a specific sequence of characters from a string, use awk's substr function, as in substr($0,10,5). Here, $0 is the complete input line, 10 is the starting character position (counting from 1), and 5 is the length of the substring to extract.
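
A minimal sketch of substr with the start position 10 and length 5 mentioned above. The sample text is an assumption, chosen so that the result is a whole word.

```shell
# Hypothetical sample text; "great" occupies character positions 10-14
text="Linux is great fun"

# substr(string, start, length) counts positions from 1
piece=$(echo "$text" | awk '{print substr($0,10,5)}')
echo "$piece"   # prints: great
```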

Similarly, with the cut command, the -c option selects characters by position; for example, cut -c13-22 selects and displays the 13th through 22nd characters.
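
The character-range form of cut can be sketched as follows. The lowercase alphabet is used here only because it makes the positions easy to check.

```shell
# Hypothetical input: the alphabet, so position N holds the Nth letter
letters="abcdefghijklmnopqrstuvwxyz"

# -c13-22 keeps characters 13 through 22 (m through v)
slice=$(echo "$letters" | cut -c13-22)
echo "$slice"   # prints: mnopqrstuv
```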

Applied to a file, cut -c7-12 extracts the 7th through 12th characters from each line. Piping the result to head -4 restricts the output to the first 4 lines.
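
To try this on a file, the sketch below writes five made-up lines to a temporary file (the file and its contents are assumptions for illustration) and then applies cut and head.

```shell
# Create a throwaway file with five hypothetical lines
sample=$(mktemp)
printf '%s\n' \
  "abcdefghijklmnop" \
  "ABCDEFGHIJKLMNOP" \
  "0123456789012345" \
  "zyxwvutsrqponmlk" \
  "extra line five!" > "$sample"

# Characters 7-12 of each line, limited to the first 4 lines
columns=$(cut -c7-12 "$sample" | head -4)
echo "$columns"
# prints:
# ghijkl
# GHIJKL
# 678901
# tsrqpo

rm -f "$sample"
```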

The grep command is also useful for selecting specific words from a file. With the -o (only matching) option, only the matched words are shown, each on its own line, not the full lines that contain them.

Without the -o option, you would see the complete lines.
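
The difference between the two forms can be sketched with a small temporary file. The file contents are made up for illustration.

```shell
# Create a throwaway file with three hypothetical lines
pets=$(mktemp)
printf '%s\n' \
  "The dog barked at the cat" \
  "No pets here" \
  "A dog and another dog" > "$pets"

# With -o: one output line per match (three matches in total)
only_matches=$(grep -o "dog" "$pets")
echo "$only_matches"
# prints:
# dog
# dog
# dog

# Without -o: the complete lines that contain a match
full_lines=$(grep "dog" "$pets")
echo "$full_lines"
# prints:
# The dog barked at the cat
# A dog and another dog

rm -f "$pets"
```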

You can also select multiple-word phrases by quoting the phrase in the grep pattern.
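
A sketch of matching a phrase rather than a single word, again using a made-up temporary file:

```shell
# Create a throwaway file with three hypothetical lines
notes=$(mktemp)
printf '%s\n' \
  "Remember to feed the dog today" \
  "Please feed the cat" \
  "feed the dog again" > "$notes"

# Quote the phrase so grep treats it as one pattern
phrases=$(grep -o "feed the dog" "$notes")
echo "$phrases"
# prints:
# feed the dog
# feed the dog

rm -f "$notes"
```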

The expr command can also select a portion of a phrase with its substr operator, which takes the string, a starting position, and a length.
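
A minimal sketch of expr with a made-up phrase. Note the assumption here: substr is provided by the GNU coreutils version of expr found on most Linux systems, and it may not be available in other implementations.

```shell
# Hypothetical sample phrase; "great" starts at position 10
phrase="Linux is great fun"

# expr substr STRING POSITION LENGTH (positions count from 1; GNU extension)
part=$(expr substr "$phrase" 10 5)
echo "$part"   # prints: great
```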

The sed command provides a convenient way to replace words in a string, using a substitution of the form s/old/new/.
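
A simple replacement with sed might look like this sketch, where the sentence and the replacement word are made up. Without a trailing g flag, only the first match on each line is replaced.

```shell
# Hypothetical sample sentence
sentence="Remember to feed the dog"

# s/old/new/ replaces the first occurrence of "dog" with "cat"
replaced=$(echo "$sentence" | sed 's/dog/cat/')
echo "$replaced"   # prints: Remember to feed the cat
```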

You can also replace multiple words in a single sed command by separating the substitutions with semicolons.
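
One way to sketch several replacements in a single sed invocation, again with a made-up sentence. Separate -e options would work as well.

```shell
# Hypothetical sample sentence
sentence="Remember to feed the dog"

# Two substitutions, separated by a semicolon and applied in order
both=$(echo "$sentence" | sed 's/dog/cat/; s/feed/walk/')
echo "$both"   # prints: Remember to walk the cat
```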

To remove leading and trailing spaces from a phrase, pipe it through the xargs command.
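
A minimal sketch of trimming with xargs, using a made-up padded phrase. Two caveats worth knowing: xargs also collapses runs of spaces inside the phrase, and it interprets quotes and backslashes, so text containing an apostrophe is not a good fit for this trick.

```shell
# Hypothetical phrase padded with three spaces on each side
padded="   feed the dog   "

# With no command given, xargs runs echo on the words it reads
trimmed=$(echo "$padded" | xargs)
echo "[$trimmed]"   # prints: [feed the dog]
```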

The xargs command also discards blank lines and tabs. For example, given a file with two lines containing only tabs and spaces plus another line that begins with four spaces and ends with a phrase, xargs outputs only the phrase. Keep in mind that xargs passes all of the words it reads to echo, so text from several lines ends up on one output line.
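
The file described above can be approximated with the sketch below. The temporary file and the phrase in it are assumptions for illustration.

```shell
# Build a throwaway file: a tabs-and-spaces line, an indented phrase,
# and another whitespace-only line
messy=$(mktemp)
printf '\t \t\n    Remember to feed the dog\n  \t  \n' > "$messy"

# xargs drops the whitespace-only lines and the leading spaces
cleaned=$(xargs < "$messy")
echo "[$cleaned]"   # prints: [Remember to feed the dog]

rm -f "$messy"
```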

Bash parameter expansion lets you specify the starting point and length of the text to extract. Assign a string to a variable, then use the syntax ${variable:offset:length} to extract a specific segment of it.
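
A minimal sketch of substring expansion, assuming bash (this syntax is not part of plain POSIX sh) and a made-up string:

```shell
# Hypothetical sample string
string="Remember to feed the dog"

# Offset 0, length 8: the first eight characters
first_word=${string:0:8}
echo "$first_word"   # prints: Remember

# Offset 12 is the thirteenth character; take four characters from there
third_word=${string:12:4}
echo "$third_word"   # prints: feed
```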

Note that this technique starts position numbering at 0. So an offset of 7 refers to the eighth character in the string, and a length of -2 means to stop two characters before the end of the string. Combining the two (offset 7, length -2) on a ten-character string therefore yields a single character, while offset 0 with length -2 yields all but the last two characters.
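
The zero-based offset and the negative length can be sketched as follows. The ten-digit string is an assumption chosen so that each character shows its own offset, and negative lengths assume a reasonably recent bash (4.2 or later).

```shell
# Hypothetical ten-character string; each digit equals its offset
string=0123456789

# Start at offset 7 and stop two characters before the end
single=${string:7:-2}
echo "$single"   # prints: 7

# Start at offset 0 and drop the last two characters
most=${string:0:-2}
echo "$most"     # prints: 01234567
```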

In this next example, we first assign a string to the first positional parameter using "set --" and then use echo to display its eighth and ninth characters with ${1:7:2}. In other words, the expansion starts with the eighth character (offset 7) and displays two characters.

NOTE: You could display the string assigned with the set command by simply using the command "echo $1". This is what the "1" in ${1:7:2} refers to.
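
A minimal sketch of the positional-parameter technique, assuming bash and a made-up sixteen-character string:

```shell
# Assign a hypothetical string to the first positional parameter ($1)
set -- 0123456789abcdef

# Offset 7 is the eighth character; take two characters from there
pair=${1:7:2}
echo "$pair"   # prints: 78

# The whole string is still available as $1
echo "$1"      # prints: 0123456789abcdef
```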

Linux provides many commands to help you manipulate text. The awk, cut, grep, expr, sed and xargs commands along with bash parameter expansion provide you with many useful options.

