Converting Tags to Lower Case

DonP

IS-IT--Management
Jul 20, 2000
684
US
Does anyone know of a function that will convert HTML tags into lower case? I am doing it now with a series of str_replace() functions but it's awkward at best.

Don
contact@pc-homepage.com
Experienced in HTML, Perl, PHP, VBScript, PWS, IIS and Apache and MS-Access, MS-SQL, MySQL databases
 
Does PHP have something that will parse HTML the same way it parses XML? You might also be able to do this in a browser by walking the DOM tree and reading things like outerHTML for each element. We can but try!
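
For what it's worth, PHP's bundled DOM extension can parse HTML much as it parses XML, and its underlying parser (libxml2) normalizes element and attribute names to lower case as it reads them. Below is a minimal sketch, assuming the DOM extension is enabled on the host. The function name lowercase_via_dom is made up for illustration, and the markup is re-serialized, so unquoted attribute values come back quoted and other details may change too.

```php
<?php
// Hypothetical helper: round-trip an HTML fragment through DOMDocument.
// The parser lower-cases tag and attribute names; attribute values are kept.
function lowercase_via_dom(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo lowercase_via_dom('<P><IMG CLASS=ImageRight SRC="/Pics/A.JPG" ALT="Logo"></P>'), "\n";
```

Because the output is a fresh serialization rather than an in-place edit, it is worth comparing it against the input on real editor output before relying on it.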
 
skiflyer:
$str = '
<SPAN CLASS=">" NAME="BOO">la de da</SPAN><A HREF="AsD.HTML">1</A><a HREF="aSD.html"></A>
';

worked correctly for me.

I used RegExp because it's the easiest way to do it...

Known is handfull, Unknown is worldfull
 
Yeah, your method is definitely correct with the adjustments.

Whenever users take two different routes to a solution, I enjoy benchmarking them to see how they compare.
 
Where do you get your data?

I thought RegExp was the easiest way to handle data (rather than using for loops, splits, etc.)

Known is handfull, Unknown is worldfull
 
You can't have a CSS class called ">" anyway.
">" is a child selector in CSS.
Can't see the point of catering for invalid code really.

But well done! Most of this is all way over my head. I just found an example in the PHP manual and it seemed to work.

 
vbkris

The benchmark data? I just set some timers and put it in a loop.

As far as easiest goes, you're probably right. Just a matter of preference and goals, I guess. I certainly wasn't trying to attack the regex solution; I hope it doesn't sound that way. In fact I think it's really good, and that it has both serious advantages and disadvantages over the looping solution (for example, if additional constraints were to be added, they would probably add negligible time to the regex while adding significant time to the looping method).
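
A "timers in a loop" benchmark along those lines might look like the sketch below. The helper time_it and both candidate functions are stand-ins invented for illustration, not the code actually compared in this thread; both are deliberately simplistic and lower-case attribute values as well as names.

```php
<?php
// Hypothetical helper: average seconds per call of $fn over $iterations runs.
function time_it(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

$html = str_repeat('<P CLASS="x"><A HREF="AsD.HTML">1</A></P>', 50);

// Candidate 1: a regex pass that lower-cases everything between < and >.
$regex = function () use ($html) {
    return preg_replace_callback('/<[^>]*>/', function ($m) {
        return strtolower($m[0]);
    }, $html);
};

// Candidate 2: a character loop that does the same thing.
$loop = function () use ($html) {
    $out = '';
    $inTag = false;
    $len = strlen($html);
    for ($i = 0; $i < $len; $i++) {
        $c = $html[$i];
        if ($c === '<') {
            $inTag = true;
        }
        $out .= $inTag ? strtolower($c) : $c;
        if ($c === '>') {
            $inTag = false;
        }
    }
    return $out;
};

printf("regex: %.6f s/call\n", time_it($regex));
printf("loop:  %.6f s/call\n", time_it($loop));
```

Timings from a sketch like this depend heavily on the PHP version and the input, so they are only meaningful relative to each other on the same machine.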

Foamcow

I'm pretty sure you can have the greater than sign as part of an attribute value though, or at least as an anchor value.
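
A greater-than sign inside a quoted attribute value is indeed legal HTML, and it is what trips up a plain "match from < to the first >" pattern. Here is a small sketch of the difference, using the test tag from earlier in the thread. The quote-aware pattern is only an illustration and assumes the quotes are balanced.

```php
<?php
$tag = '<SPAN CLASS=">" NAME="BOO">la de da</SPAN>';

// Naive: stop at the first ">" after "<", even when it sits inside quotes.
preg_match('/<.*?>/', $tag, $naive);

// Quote-aware: a tag body is a run of characters that are neither quotes
// nor ">", or complete "..." / '...' strings, followed by the closing ">".
preg_match('/<(?:[^>"\']|"[^"]*"|\'[^\']*\')*>/', $tag, $aware);

echo $naive[0], "\n";   // <SPAN CLASS=">
echo $aware[0], "\n";   // <SPAN CLASS=">" NAME="BOO">
```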
 
Per vbkris' code several posts back, is $str some required name? When I replace it with the actual variable that my code uses, I get odd results:

For example, the original:

Code:
<P><IMG class=ImageRight height=185 src="/php/show?image.php?ID=22" width=185 align=right>

becomes:

Code:
<P><IMG class=ImageRight height=185 src='22"' width=185 align=right>

where the entire path and file name are removed, rather than

Code:
<p><img class=ImageRight height=185 src="/php/show?image.php?ID=22" width=185 align=right>



Don
 
>>$str some required name

Nope, it should work as long as you replace every $str with $html...

Yes, there is a bug; let me try to fix it...

Known is handfull, Unknown is worldfull
 
corrected:
Code:
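<?php
$str = '
<P><IMG class="ImageRight" height=185 src="/php/show?Image.php?ID=22" width="185" align="right">
';
// Remember every quoted attribute (name="value") so the values can be restored unchanged.
preg_match_all("/(\s[^\s]*=(['\"]).+\\2)/Ui", $str, $out, PREG_PATTERN_ORDER);

// Hide the quoted values behind a placeholder, then lower-case everything inside <...>.
// preg_replace_callback() stands in for the /e modifier, which was removed in PHP 7.
$str = preg_replace("/(=(['\"]).+\\2)/Ui", "ToBeReplaced", $str);
$str = preg_replace_callback("/<(.*)>/Ui", function ($m) { return '<' . strtolower($m[1]) . '>'; }, $str);

// Put each saved value back in place of its placeholder, in order.
for ($i = 0; $i < count($out[0]); $i++)
{
	$TheStr = $out[0][$i];
	$TheInsidePath = preg_replace("/^.*(=.*)$/U", "\\1", $TheStr);
	$str = preg_replace("/tobereplaced/i", '#$%' . $i . $TheInsidePath, $str, 1) or die($TheInsidePath);
}

// Strip the temporary #$%<n> markers again.
$str = preg_replace("/#\\$%\d+=/i", "=", $str) or die($TheInsidePath);
echo($str);
?>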

Note: Some attribute values are not enclosed in ' or "; they have to be quoted for the code to read them correctly...
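
For what it's worth, a single preg_replace_callback() pass can leave attribute values alone whether or not they are quoted. The sketch below is one possible approach, not a drop-in replacement for the code above. The function name and both patterns are assumptions made for illustration, and it ignores comments, script blocks, and unbalanced quotes.

```php
<?php
// Hypothetical helper: lower-case tag and attribute names, keep attribute values.
function lowercase_tags(string $html): string
{
    // One whole tag; quoted values may contain ">".
    $tag = '/<\/?[a-z](?:[^>"\']|"[^"]*"|\'[^\']*\')*>/i';
    // Inside a tag: group 1 is "=value" (quoted or bare), group 2 is everything else.
    $part = '/(=\s*(?:"[^"]*"|\'[^\']*\'|[^\s"\'>]+))|([^=]+)/';

    return preg_replace_callback($tag, function ($m) use ($part) {
        return preg_replace_callback($part, function ($p) {
            // Values pass through untouched; names and punctuation are lower-cased.
            return $p[1] !== '' ? $p[1] : strtolower($p[2]);
        }, $m[0]);
    }, $html);
}

echo lowercase_tags('<P><IMG class=ImageRight height=185 src="/php/show?image.php?ID=22" width=185 align=right>'), "\n";
echo lowercase_tags('<SPAN CLASS=">" NAME="BOO">la de da</SPAN><A HREF="AsD.HTML">1</A>'), "\n";
```

The idea is to decide per piece of a tag whether it is a value (anything following "=") or not, instead of hiding the values behind placeholders and restoring them afterwards.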

Known is handfull, Unknown is worldfull
 
They WERE all enclosed in quotes, but the online form-based WYSIWYG editor seems to strip them. It is the ugly HTML that this editor generates that I am trying to correct but I don't think there is a way of making it stop stripping the quotes.

Don
 
I notice that there seems to be some issue with certain tags. For example:

Code:
<A href="http://www.domain.com/" target="new">

becomes

Code:
<A href#$%9href="http://www.domain.com/" target="new">

I am beginning to think that maybe regular expressions aren't the way to go so maybe it's back to a series of basic str_replace() function calls in order to make it change only what it should. There is probably a rather finite list that can be entered into an array, then looped through somehow in str_replace().
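
A list-driven version of that idea could look like the sketch below, since str_replace() accepts arrays for both the search and the replacement arguments. The function name and the tag list are made up for illustration. It only catches all-upper-case tag names, does not touch attribute names, and a listed name that is a prefix of an unlisted one (B and BR, say) will clip the longer tag.

```php
<?php
// Hypothetical helper: lower-case a fixed list of tag names with one str_replace() call.
function lowercase_listed_tags(string $html, array $tags): string
{
    // Longest names first, so "<PRE" is handled before "<P" can clip it.
    usort($tags, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    $search = [];
    $replace = [];
    foreach ($tags as $tag) {
        $upper = strtoupper($tag);
        $lower = strtolower($tag);
        // Opening tags.
        $search[] = '<' . $upper;
        $replace[] = '<' . $lower;
        // Closing tags.
        $search[] = '</' . $upper;
        $replace[] = '</' . $lower;
    }

    // str_replace() applies the pairs in order, each on the result of the last.
    return str_replace($search, $replace, $html);
}

echo lowercase_listed_tags('<P><B>Hi</B> <PRE>x</PRE></P>', ['p', 'b', 'pre']), "\n";
// prints <p><b>Hi</b> <pre>x</pre></p>
```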

Don
 
I already provided a non-regex approach that isn't limited to a given list of tags. Such a list is finite, but not exactly small.
 
That's strange, Don; it worked for me...

Known is handfull, Unknown is worldfull
 
The variable here is the online WYSIWYG editor that I am using, the HTML from which I am trying to clean up. I am pretty sure that it is the reason for these odd problems, so I am now trying a newer (but Beta) version that seems to produce cleaner code, along with a series of str_replace() calls to tweak it the way I want. It meant twenty-six str_replace() calls, but it works.

Don
 
But like skiflyer pointed out, it will be really dextrous...

Known is handfull, Unknown is worldfull
 
Yes, I looked at it but my site is hosted through a provider where I cannot install third party applications. The Tidy site wasn't very clear about downloading or installing anyway.

Anyway, mine is now giving me the reasonably clean HTML that I wanted. Thanks everyone for the very useful information and help.

Don
 