Asked  7 Months ago    Answers:  5   Viewed   42 times

If I have the following values:

 $var1 = AR3,373.31

 $var2 = 12.322,11T

How can I create a new variable and set it to a copy of the data that has any non-numeric characters removed, with the exception of commas and periods? The values above would return the following results:

 $var1_copy = 3,373.31

 $var2_copy = 12.322,11

 Answers

16

You could use preg_replace to strip out every character that is not a digit, a comma, or a period/full stop, as follows:

$testString = '12.322,11T';
echo preg_replace('/[^0-9,.]+/', '', $testString);

The pattern can also be expressed as /[^\d,.]+/
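
For comparison, here is a minimal sketch of the same character-class idea in JavaScript, the other language used on this page. The helper name stripNonNumeric is hypothetical and only for illustration; the two inputs are the values from the question.

```javascript
// Hypothetical helper: keep digits, commas and periods; drop everything else.
function stripNonNumeric(value) {
  return value.replace(/[^0-9,.]+/g, '');
}

const var1Copy = stripNonNumeric('AR3,373.31');
const var2Copy = stripNonNumeric('12.322,11T');

console.log(var1Copy); // 3,373.31
console.log(var2Copy); // 12.322,11
```

Note that this only removes characters. It does not decide whether the comma or the period is the decimal separator.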

Wednesday, March 31, 2021
 
Muazam
answered 7 Months ago
46

A regex would be simplest:

$input = 'foo_left.jpg';
if(!preg_match('/_(left|right|center)/', $input, $matches)) {
    // no match
}

$pos = $matches[0]; // "_left", "_right" or "_center"

Update:

For a more defensive-minded approach (if there might be multiple instances of "_left" and friends in the filename), you can consider adding to the regex.

This will match only if the l/r/c is followed by a dot:

preg_match('/(_(left|right|center))\./', $input, $matches);

This will match only if the l/r/c is followed by the last dot in the filename (which practically means that the base name ends with the l/r/c specification):

preg_match('/(_(left|right|center))\.[^\.]*$/', $input, $matches);

And so on.

If using these regexes, you will find the result in $matches[1] instead of $matches[0].
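
As a rough illustration of the "last dot" variant, the sketch below applies the same pattern in JavaScript. The function name alignmentSuffix and the second sample filename are assumptions made for this example, not part of the original answer.

```javascript
// Hypothetical helper: return "_left", "_right" or "_center" only when it
// sits directly before the last dot of the filename; otherwise return null.
function alignmentSuffix(filename) {
  const match = /(_(left|right|center))\.[^.]*$/.exec(filename);
  return match === null ? null : match[1];
}

console.log(alignmentSuffix('foo_left.jpg'));       // _left
console.log(alignmentSuffix('foo_left_draft.jpg')); // null
```

Checking for null before reading the capture group avoids using a stale or missing match when the filename has no such suffix.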

Wednesday, March 31, 2021
 
braindamage
answered 7 Months ago
61

This can't work properly. Unicode contains many more characters than ANSI, so if you "convert" to ANSI you will lose a lot of characters.

http://php.net/manual/en/function.htmlentities.php

You can use Unicode (UTF-8) charset with htmlentities:

string htmlentities ( string $string [, int $flags = ENT_COMPAT [, string $charset [, bool $double_encode = true ]]] )

htmlentities($myString, ENT_COMPAT, "UTF-8"); should work.

Thursday, August 5, 2021
 
CoderGuy123
answered 3 Months ago
96
> 'worth $12,345.00 dollars'.replace(/[^0-9$.,]/g, '')
"$12,345.00"

This is the answer you asked for. I would not recommend it for extracting currencies, since it can suffer from problems like this:

> 'A set of 12 worth between $123 and $456. A good buy.'.replace(/[^0-9$.,]/g, '')
"12$123$456.."

If you want to just extract expressions of a currency-like form, you could do:

> 'set of 12 worth between $123.00 and $45,678'.match(/\$[0-9,]+(?:\.\d\d)?/g)
["$123.00", "$45,678"]

If you need more complicated matching (e.g. you'd just like to extract the dollar value and ignore the cent value) you could use capture groups, as described in "How do you access the matched groups in a JavaScript regular expression?". For example:

> var regex = /\$([0-9,]+)(?:\.(\d\d))?/g;
> while (true) {
>     var match = regex.exec('set of 12 worth between $123.00 and $45,678');
>     if (match === null)
>         break;
>     console.log(match);
> }
["$123.00", "123", "00"]
["$45,678", "45,678", undefined]

(Be careful: JavaScript regexp objects with the g flag are not immutable/final objects. They keep state in lastIndex, which is what makes the iteration above work, so you cannot safely share one such object between unrelated searches. Even myRegex2 = RegExp(myRegex) will mix state, because calling RegExp without new on an existing regex returns the same object rather than a copy; a very poor language decision for the constructor. See the addendum on how to properly clone regexes in JavaScript.) You can rewrite the above as a very exotic for-loop if you'd like:

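var myString = 'set of 12 worth between $123.00 and $45,678';
// Backslashes are doubled because the pattern is written as a string literal.
var regex = '\\$([0-9,]+)(?:\\.(\\d\\d))?';

// Call exec on the RegExp object r (not on the string) and parenthesize the
// assignment so that match holds the result array rather than a boolean.
for (var match, r = RegExp(regex, 'g'); (match = r.exec(myString)) !== null; )
    console.log(match);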
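
If the runtime supports String.prototype.matchAll (ES2020), the manual exec loop can be avoided. The following is only a sketch under that assumption; the function name extractDollarAmounts and the shape of the returned objects are made up for illustration.

```javascript
// Hypothetical helper: collect the dollar and cent parts of every
// currency-like expression in the text. A fresh regex is created on each
// call, so no lastIndex state leaks between calls.
function extractDollarAmounts(text) {
  const pattern = /\$([0-9,]+)(?:\.(\d\d))?/g;
  return Array.from(text.matchAll(pattern), (m) => ({
    dollars: m[1],
    cents: m[2], // undefined when the amount has no cents part
  }));
}

const amounts = extractDollarAmounts('set of 12 worth between $123.00 and $45,678');
console.log(amounts);
```

Because the regex lives inside the function, repeated calls always start matching from the beginning of the input.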

addendum - Why you can't reuse javascript RegExp objects

Bad language design, demonstrating how state is reused:

var r=/(x.)/g
var r2 = RegExp(r)

r.exec('xa xb xc')
["xa", "xa"]
r2.exec('x1 x2 x3')
["x2", "x2"]

One way to get independent regex objects in JavaScript is to build each of them from the same pattern string:

var regexTemplate = '(x.)'

var r = RegExp(regexTemplate, 'g')
var r2 = RegExp(regexTemplate, 'g')

r.exec('xa xb xc')
["xa", "xa"]
r2.exec('x1 x2 x3')
["x1", "x1"]

If you wish to programmatically preserve flags such as 'g', you can probably use regexTemplate = ['(x.)', 'g']; RegExp.apply(this, regexTemplate).
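
Another option, sketched here on the assumption that the runtime provides the ES2015 flags property, is to build the copy from the source and flags of the existing regex. The helper name cloneRegExp is hypothetical.

```javascript
// Hypothetical helper: create an independent copy with the same pattern and
// flags. The copy starts with its own lastIndex of 0.
function cloneRegExp(re) {
  return new RegExp(re.source, re.flags);
}

const original = /(x.)/g;
const copy = cloneRegExp(original);

const fromOriginal = original.exec('xa xb xc')[0]; // advances original.lastIndex
const fromCopy = copy.exec('x1 x2 x3')[0];         // unaffected by the call above

console.log(fromOriginal); // xa
console.log(fromCopy);     // x1
```

Unlike RegExp(r) called without new, this always returns a new object, so the two regexes do not share matching state.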

Wednesday, August 11, 2021
 
JackTheKnife
answered 3 Months ago
25

Here is a good case for regular expressions. You can run a find and replace on the data either before you import (easier) or later on if the SQL import accepted those characters (not nearly as easy). But in either case, you have any number of methods to do a find and replace, be it editors, scripting languages, GUI programs, etc. Remember that you're going to want to find and replace all of the bad characters.

A typical regular expression to find commas and quotes (assuming just double quotes) is the following blacklist:

/[,"]/

Or, if you think the data might change in the future, this whitelist regular expression matches anything except a digit or a decimal point:

/[^0-9.]/

What has been discussed by the people above is that we don't know all of the data in your CSV file. It sounds like you want to remove the commas and quotes from all of the numbers in the CSV file. But because we don't know what else is in the CSV file we want to make sure that we don't corrupt other data. Just blindly doing a find/replace could affect other portions of the file.
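
To limit the risk described above, the whitelist can be applied to a single field that is already known to be numeric rather than to the whole file. The sketch below assumes exactly that; the function name cleanNumericField and the sample values are hypothetical.

```javascript
// Hypothetical helper: strip everything except digits and the decimal point
// from ONE field that is known to hold a number. Not meant for whole files.
function cleanNumericField(field) {
  return field.replace(/[^0-9.]/g, '');
}

const cleaned = cleanNumericField('"1,234.56"');
console.log(cleaned); // 1234.56
```

Note that this whitelist also drops minus signs, so it would need adjusting if the column can contain negative values.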

Tuesday, September 14, 2021
 
Kasun Sandaruwan
answered 1 Month ago