Asked  7 Months ago    Answers:  5   Viewed   50 times

Using PHP, I'm trying to create a script that searches a text file for a string, grabs the entire line containing it, and echoes that line.

I have a text file titled "numorder.txt" containing several lines of data, with new lines added every 5 minutes by a cron job. The data looks similar to:

2 aullah1
7 name
12 username

How would I go about creating a PHP script that searches for the data "aullah1", grabs the entire line, and echoes it? Once echoed, it should display "2 aullah1" (without the quotation marks).

If I didn't explain anything clearly and/or you'd like me to explain in more detail, please comment.

 Answers

91

Here is a PHP example. If several lines match, all of them are displayed:

<?php
$file = 'somefile.txt';
$searchfor = 'name';

// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   echo implode("\n", $matches[0]);
}
else{
   echo "No matches found";
}
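
One caveat with the substring pattern above: searching for "name" also matches the line "12 username". If only whole-field matches are wanted, the following is a minimal sketch, assuming every line has the form "<number> <username>" as in the question. The helper name findLinesByUser is hypothetical, and the sample data is inlined instead of being read from the file.

```php
<?php
// Hypothetical helper: return every line whose second whitespace-separated
// field equals $user exactly. Assumes lines look like "2 aullah1".
function findLinesByUser(string $contents, string $user): array
{
    $found = [];
    // \R matches any newline style, so files written on Windows also work.
    foreach (preg_split('/\R/', $contents) as $line) {
        $fields = preg_split('/\s+/', trim($line));
        if (count($fields) >= 2 && $fields[1] === $user) {
            $found[] = $line;
        }
    }
    return $found;
}

// Sample data from the question, inlined so the sketch runs on its own.
$contents = "2 aullah1\n7 name\n12 username\n";
$linesForName = findLinesByUser($contents, 'name');
echo implode("\n", $linesForName), "\n";
```

With a real file, $contents would come from file_get_contents() exactly as in the example above.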
Wednesday, March 31, 2021
 
Daveel
answered 7 Months ago
58

Fortunately, text extraction from PDFs is a subject that has been covered many times. On the command line, you could use pdftotext (available on Linux or Mac), or in your code a library such as Apache Tika (for which you can find a PHP wrapper).

To avoid too much noise in your records, I'd recommend then splitting the text and creating one record per paragraph. You can then use Algolia's distinct feature to deduplicate the results.
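
As a rough sketch of the paragraph split, assuming the text has already been extracted and that paragraphs are separated by blank lines: the helper name paragraphsToRecords and the field names file, url and content are illustrative assumptions, not part of any API. The shared file value is the kind of attribute that deduplication could be keyed on.

```php
<?php
// Hypothetical helper: turn extracted text into one record per paragraph.
// Field names other than objectID are illustrative assumptions.
function paragraphsToRecords(string $text, string $fileName, string $url): array
{
    // Treat one or more blank lines as a paragraph boundary.
    $paragraphs = preg_split('/\R\s*\R/', trim($text));
    $records = [];
    foreach ($paragraphs as $i => $paragraph) {
        // Collapse line wraps inside a paragraph into single spaces.
        $paragraph = trim(preg_replace('/\s+/', ' ', $paragraph));
        if ($paragraph === '') {
            continue;
        }
        $records[] = [
            'objectID' => $fileName . '-' . $i,
            'file'     => $fileName, // same for every paragraph of one file
            'url'      => $url,      // link back to the source document
            'content'  => $paragraph,
        ];
    }
    return $records;
}

// Made-up sample text and URL, used only to exercise the helper.
$text = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph.\n";
$records = paragraphsToRecords($text, 'report.pdf', 'https://example.com/report.pdf');
echo count($records), "\n";
```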

You should already have the links to your files somewhere. Just store them in your records, and in your front end you'll easily be able to create links to them using, for instance, autocomplete.js or instantsearch.js.

Saturday, May 29, 2021
 
cusejuice
answered 5 Months ago
93

You can use SplFileObject and iterate over the file line by line.

Please note that iterating over the file is much more memory efficient than using file() or file_get_contents() because it does not read the entire file content into an array or a variable.

$username = 'AULLAH1';
$file = new SplFileObject("data.csv");
$grabbed = FALSE;
while (!$file->eof()) {
     $data = $file->fgetcsv(' ');
     if($data[0] === $username) {
         $grabbed = $file->current();
     }
}
echo $grabbed;

Because fgetcsv() takes into account delimiters, enclosures and escape characters when parsing the line, it is not exactly fast though. If you have to parse a couple hundred or thousand lines this way, make sure you actually have the need for it.


An alternative would be to just check if the current line contains the username string somewhere. In the file format you show in the question, this would be feasible because the remaining fields contain digits, so there cannot be any false positives:

$username = 'AULLAH1';
$file = new SplFileObject("data.txt");
$grabbed = FALSE;
foreach($file as $line) {
    if(strpos($line, $username) !== FALSE) {
        $grabbed = $line;
    }
}
echo $grabbed;

If you want to make sure the $username was found at position 1, change the if condition to test for === 1.
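
A small sketch of that stricter check, assuming the quoted line format shown in the output further down, where each line opens with a double quote so the first field begins at offset 1. The lines are inlined here, and the second one is invented purely to show a non-match.

```php
<?php
// Sample lines in the quoted format; inlined so the sketch runs without a
// data file. The second line is made up for illustration.
$lines = [
    '"AULLAH1" "01/07/2010 15:28 " "55621454" "123456" "123456.00"',
    '"OTHER" "02/07/2010 09:10 " "AULLAH1" "123456" "123456.00"',
];
$username = 'AULLAH1';
$grabbed = false;
foreach ($lines as $line) {
    // Offset 0 is the opening quote, so the first field starts at offset 1.
    if (strpos($line, $username) === 1) {
        $grabbed = $line;
    }
}
echo $grabbed, "\n";
```

For the unquoted format in the question ("2 aullah1"), the username is not at a fixed offset, so this positional test would not apply as written.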


If, for some reason, you want to have all lines where the username occurs, you can write a custom FilterIterator to iterate over the file contents, e.g.

class UsernameFilter extends FilterIterator
{
    protected $_username;
    public function __construct(Iterator $iterator, $username)
    {
        $this->_username = $username;
        parent::__construct($iterator);
    }
    public function accept()
    {
        return strpos($this->current(), $this->_username) !== FALSE;
    }
}

Then you can simply use foreach. The FilterIterator will pass each line to accept() and only those lines for which it returns TRUE are actually used.

$filteredLines = new UsernameFilter(new SplFileObject('data.txt'), 'AULLAH1');
foreach($filteredLines as $line) {
    echo $line;
}

The above would output

"AULLAH1" "01/07/2010 15:28 " "55621454" "123456" "123456.00"
"AULLAH1" "07/07/2010 15:05 " "55621454" "189450" "123456.00"

If you want these lines in an array, you can do

$lines = iterator_to_array($filteredLines);

and to look at the last item

echo end($lines);
Saturday, May 29, 2021
 
innovation
answered 5 Months ago
65

Iterate through all the lines (StreamReader, File.ReadAllLines, etc.) and check whether line.Contains("December") is true (replace "December" with the user input).

Edit: I would go with StreamReader in case you have large files, and use the IndexOf example from @Matias Cicero instead of Contains for a case-insensitive match.

Console.Write("Keyword: ");
var keyword = Console.ReadLine() ?? "";
using (var sr = new StreamReader("")) {
    while (!sr.EndOfStream) {
        var line = sr.ReadLine();
        if (String.IsNullOrEmpty(line)) continue;
        if (line.IndexOf(keyword, StringComparison.CurrentCultureIgnoreCase) >= 0) {
            Console.WriteLine(line);
        }
    }
}
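
For the PHP setting of the original question, the same idea (read line by line, compare without regard to case) could be sketched as follows. The helper name grepCaseInsensitive is hypothetical, and the demo writes the question's sample data to a temporary file. Note that stripos() ignores case for ASCII letters only, so it is not a full equivalent of a culture-aware comparison.

```php
<?php
// Hypothetical helper: return the lines of $path that contain $keyword,
// ignoring ASCII case. Reads line by line instead of loading the whole file.
function grepCaseInsensitive(string $path, string $keyword): array
{
    $matches = [];
    $file = new SplFileObject($path);
    // Strip the trailing newline from each line while iterating.
    $file->setFlags(SplFileObject::DROP_NEW_LINE);
    foreach ($file as $line) {
        if ($line !== '' && stripos($line, $keyword) !== false) {
            $matches[] = $line;
        }
    }
    return $matches;
}

// Demo on a temporary file holding the sample data from the question.
$path = tempnam(sys_get_temp_dir(), 'num');
file_put_contents($path, "2 aullah1\n7 name\n12 username\n");
$hits = grepCaseInsensitive($path, 'AULLAH1');
unlink($path);
echo implode("\n", $hits), "\n";
```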
Wednesday, August 11, 2021
 
Kevin
answered 3 Months ago
36
@echo off
setlocal EnableDelayedExpansion

REM INITIALIZE THE LIST OF WORDS THAT WILL BE SEARCHED
set targetWords=:EOF

rem I'd like to search for a list of words from an external list (simple each word on a line)
for /F %%a in (List.txt) do (   
   rem and search for them in a file (C:UsesP DittyDocumentsSH3datacfgBackups_SCR*.clg) 
   findstr "%%a" "C:UsesP DittyDocumentsSH3datacfgBackups_SCR*.clg" > NUL
   rem if they are there...
   if !errorlevel! equ 0 (
      REM INSERT THE WORD IN THE TARGET LIST
      set targetWords=!targetWords! %%a
   )
)

REM INSERT THE END-OF-FILE MARK IN THE FILE
echo :EOF>> Campaign_SCR.mis.tmp

REM INITIALIZE THE NUMBER OF LAST PROCESSED LINE IN REDIRECTED Campaign_SCR.mis.tmp
set lastLine=0

rem ... find those words on another file(Campaign_SCR.mis.tmp)
< Campaign_SCR.mis.tmp (for /F "delims=:" %%a in ('findstr /N "%targetWords%" Campaign_SCR.mis.tmp') do (
   REM DUPLICATE PREVIOUS LINES UNTIL NEW TARGET LINE
   set /A numOfLines=%%a-lastLine-1
   for /L %%i in (1,1,!numOfLines!) do (
      set line=
      set /P line=
      echo(!line!
   )
   rem if the line starts with "Name="
   set /P line=
   if "!line:~0,5!" equ "Name=" (
      rem replacing the whole line...with Name=ShipDummy
      echo Name=ShipDummy
      rem After that the two lines below that in that same file would be replaced with 2nd line "Class=ShipDummy", 
      set /P line=
      echo Class=ShipDummy
      rem then 3rd line "Type=206".
      set /P line=
      echo Type=206
      set /A lastLine=%%a+2
   ) else (
      REM DUPLICATE THE NON MATCHING LINE, IF IT IS NOT THE :EOF MARK
      if "!line!" neq ":EOF" (
         echo !line!
         set lastLine=%%a
      )
   )
)) > Campaign_SCR.mis.tmp.NEW

REM UPDATE THE NEW FILE
REM del Campaign_SCR.mis.tmp
REM ren Campaign_SCR.mis.tmp.NEW Campaign_SCR.mis.tmp
Tuesday, August 31, 2021
 
Sunny Shah
answered 2 Months ago