PHP Resources
PHP: get_html_translation_table - Manual

Last updated: Fri, 18 Jul 2008


get_html_translation_table

(PHP 4, PHP 5)

get_html_translation_table — Returns the translation table used by htmlspecialchars() and htmlentities()

Description

array get_html_translation_table ([ int $table [, int $quote_style ]] )

get_html_translation_table() will return the translation table that is used internally for htmlspecialchars() and htmlentities().

Note: Special characters can be encoded in several ways. E.g. " can be encoded as &quot;, &#34; or &#x22;. get_html_translation_table() returns only the most common form for them.
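
To make the note above concrete, here is a minimal sketch showing that the three spellings decode to the same character, while the table itself carries only the named form. It assumes html_entity_decode() is available (PHP 4.3 or later); the variable names are illustrative only.

```php
<?php
// Three spellings of the same character: named, decimal and hexadecimal.
$forms = array('&quot;', '&#34;', '&#x22;');

$decoded = array();
foreach ($forms as $form) {
    // ENT_QUOTES is passed explicitly so quotes are decoded regardless of defaults.
    $decoded[] = html_entity_decode($form, ENT_QUOTES);
}

// The translation table lists a single form for the double quote.
$table = get_html_translation_table(HTML_SPECIALCHARS, ENT_COMPAT);
$canonical = $table['"'];
```

Every element of $decoded is a plain double quote, and $canonical holds the named form.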

Parameters

table

There are two new constants (HTML_ENTITIES, HTML_SPECIALCHARS) that allow you to specify the table you want. Default value for table is HTML_SPECIALCHARS.

quote_style

Like the htmlspecialchars() and htmlentities() functions you can optionally specify the quote_style you are working with. The default is ENT_COMPAT mode. See the description of these modes in htmlspecialchars().
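
As a rough illustration of the three quote modes, the sketch below asks for the HTML_SPECIALCHARS table under each mode and records which quote characters it contains. The helper quote_coverage() is a hypothetical name introduced for this example, not part of PHP.

```php
<?php
// Hypothetical helper: reports which quote characters a table would translate.
function quote_coverage($table) {
    return array(
        'double' => isset($table['"']),
        'single' => isset($table["'"]),
    );
}

$compat   = quote_coverage(get_html_translation_table(HTML_SPECIALCHARS, ENT_COMPAT));
$quotes   = quote_coverage(get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES));
$noquotes = quote_coverage(get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES));
```

ENT_COMPAT covers only the double quote, ENT_QUOTES covers both, and ENT_NOQUOTES covers neither.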

Return Values

Returns the translation table as an array.

Examples

Example #1 Translation Table Example

<?php
$trans = get_html_translation_table(HTML_ENTITIES);
$str = "Hallo & <Frau> & Krämer";
$encoded = strtr($str, $trans);
?>
The $encoded variable will now contain: "Hallo &amp; &lt;Frau&gt; &amp; Kr&auml;mer".
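
The same table can be turned around with array_flip() to decode again. The sketch below uses the HTML_SPECIALCHARS table so that it does not depend on the character set of the source file; the sample string is an assumption made for the example.

```php
<?php
$table = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES);

$plain   = 'a < b & "c"';
// strtr() with an array substitutes in one pass, so "&" is not encoded twice.
$encoded = strtr($plain, $table);
// Flipping the table swaps characters and entities, giving a decoding table.
$decoded = strtr($encoded, array_flip($table));
```

Here $encoded matches what htmlspecialchars($plain, ENT_QUOTES) produces, and $decoded equals $plain again.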



User Contributed Notes
get_html_translation_table
adolfoabegg at gmail dot com
03-Jul-2008 12:47
"rafael at phpit dot com dot br": your solution works, but only for the ISO-8859-1 encoding. That is because get_html_translation_table() won't let you specify the charset; it uses the default one, which is ISO-8859-1.

The solution from "olito24 at gmx dot de" does work for UTF-8. I modified it slightly to specify the UTF-8 charset. Also, the $str parameter wasn't being used at all, so I renamed it to $string.

Note:
Change ENT_NOQUOTES to ENT_QUOTES to convert both double and single quotes.

These functions encode HTML entities but leave tags alone, for UTF-8 and ISO-8859-1 respectively:

<?php

class Html
{
    /* by olito24 at gmx dot de */
    function htmlButTags($string) {
        $pattern = '<([a-zA-Z0-9\. "\'_\/-=;\(\)?&#%]+)>';

        preg_match_all('/' . $pattern . '/', $string, $tagMatches, PREG_SET_ORDER);
        $textMatches = preg_split('/' . $pattern . '/', $string);

        foreach ($textMatches as $key => $value) {
            $textMatches[$key] = htmlentities($value, ENT_NOQUOTES, 'UTF-8');
        }

        // preg_split() yields one more text chunk than there are tags,
        // so the last chunk has no tag to append
        for ($i = 0; $i < count($textMatches); $i++) {
            if (isset($tagMatches[$i])) {
                $textMatches[$i] = $textMatches[$i] . $tagMatches[$i][0];
            }
        }

        return implode($textMatches);
    }

    /* by "rafael at phpit dot com dot br" */
    function htmlButTags_iso($str) {
        // Take all the html entities
        $caracteres = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES);
        // Find out the "tags" entities
        $remover = get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES);
        // Strip the tag entities out of the original table
        $caracteres = array_diff($caracteres, $remover);
        // Translate the string....
        $str = strtr($str, $caracteres);
        // And that's it!
        return $str;
    }
}

?>
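
On newer PHP versions the charset limitation described above can be avoided. This is a minimal sketch assuming PHP 5.3.4 or later, where get_html_translation_table() accepts an encoding as a third argument. The function name html_but_tags_utf8() is hypothetical, and the "ä" is written as raw UTF-8 bytes so the example does not depend on the file's encoding.

```php
<?php
// Assumes PHP >= 5.3.4 (third "encoding" argument).
function html_but_tags_utf8($str) {
    $all  = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'UTF-8');
    $tags = get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES, 'UTF-8');
    // Remove &amp;, &lt; and &gt; from the table so markup is left untouched.
    return strtr($str, array_diff($all, $tags));
}

$out = html_but_tags_utf8("<b>Kr\xC3\xA4mer</b> & Co");
```

The result keeps the tags and the bare ampersand and encodes only the accented letter.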
Liam Morland
16-Jun-2008 08:57
Here is a simple way to convert named character entities to numeric character entities:

<?php
function numeric_entities($string){
    $mapping = array();
    foreach (get_html_translation_table(HTML_ENTITIES, ENT_QUOTES) as $char => $entity){
        $mapping[$entity] = '&#' . ord($char) . ';';
    }
    return str_replace(array_keys($mapping), $mapping, $string);
}
?>
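
Note that ord() looks at a single byte only, so the function above is tied to single-byte tables. The following sketch shows one way the same idea might be extended to the UTF-8 table, assuming PHP 5.3.4 or later for the encoding argument. Both utf8_codepoint() and numeric_entities_utf8() are hypothetical helpers written for this example.

```php
<?php
// Hypothetical helper: code point of one well-formed UTF-8 character.
function utf8_codepoint($ch) {
    $bytes = array_values(unpack('C*', $ch));
    $n = count($bytes);
    if ($n === 1) {
        return $bytes[0];
    }
    // The lead byte of an n-byte sequence carries its payload in the low (7 - n) bits.
    $cp = $bytes[0] & (0xFF >> ($n + 1));
    for ($i = 1; $i < $n; $i++) {
        // Each continuation byte contributes six more bits.
        $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
    }
    return $cp;
}

// Converts named entities to decimal references using the UTF-8 table.
function numeric_entities_utf8($string) {
    $mapping = array();
    foreach (get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8') as $char => $entity) {
        $mapping[$entity] = '&#' . utf8_codepoint($char) . ';';
    }
    // strtr() replaces in one pass, so output is never rescanned.
    return strtr($string, $mapping);
}

$converted = numeric_entities_utf8('&euro; &eacute; &amp;');
```

With this table, entities above code position 255 such as &euro; are converted as well.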
iain (duh) workingsoftware.com.au
07-Sep-2007 02:06
I wrote a quick little function for converting something like '&middot;' into '&#183;':

$to_convert = '&middot;';
$table = get_html_translation_table(HTML_ENTITIES);
$equiv = '&#'.ord(array_search($to_convert,$table)).';';
Maurizio Siliani at trident dot it
20-Jul-2007 08:43
If you have trouble (like me) getting data from ISO-8859-1 encoded forms where users copy and paste from Word, this routine could be useful.
It adds to the standard get_html_translation_table() the codes of the characters that MS Word usually substitutes into typed text.
Otherwise those characters would never be displayed correctly in HTML output.

function get_html_translation_table_CP1252() {
    $trans = get_html_translation_table(HTML_ENTITIES);
    $trans[chr(130)] = '&sbquo;';    // Single Low-9 Quotation Mark
    $trans[chr(131)] = '&fnof;';    // Latin Small Letter F With Hook
    $trans[chr(132)] = '&bdquo;';    // Double Low-9 Quotation Mark
    $trans[chr(133)] = '&hellip;';    // Horizontal Ellipsis
    $trans[chr(134)] = '&dagger;';    // Dagger
    $trans[chr(135)] = '&Dagger;';    // Double Dagger
    $trans[chr(136)] = '&circ;';    // Modifier Letter Circumflex Accent
    $trans[chr(137)] = '&permil;';    // Per Mille Sign
    $trans[chr(138)] = '&Scaron;';    // Latin Capital Letter S With Caron
    $trans[chr(139)] = '&lsaquo;';    // Single Left-Pointing Angle Quotation Mark
    $trans[chr(140)] = '&OElig;';    // Latin Capital Ligature OE
    $trans[chr(145)] = '&lsquo;';    // Left Single Quotation Mark
    $trans[chr(146)] = '&rsquo;';    // Right Single Quotation Mark
    $trans[chr(147)] = '&ldquo;';    // Left Double Quotation Mark
    $trans[chr(148)] = '&rdquo;';    // Right Double Quotation Mark
    $trans[chr(149)] = '&bull;';    // Bullet
    $trans[chr(150)] = '&ndash;';    // En Dash
    $trans[chr(151)] = '&mdash;';    // Em Dash
    $trans[chr(152)] = '&tilde;';    // Small Tilde
    $trans[chr(153)] = '&trade;';    // Trade Mark Sign
    $trans[chr(154)] = '&scaron;';    // Latin Small Letter S With Caron
    $trans[chr(155)] = '&rsaquo;';    // Single Right-Pointing Angle Quotation Mark
    $trans[chr(156)] = '&oelig;';    // Latin Small Ligature OE
    $trans[chr(159)] = '&Yuml;';    // Latin Capital Letter Y With Diaeresis
    ksort($trans);
    return $trans;
}
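
As an alternative sketch, htmlentities() lists cp1252 among its supported charsets, so the Word characters can also be encoded by naming that charset directly. The sample bytes below (curly quotes, an em dash and an accented e in Windows-1252) are assumptions chosen for the example.

```php
<?php
// \x93 and \x94 are curly double quotes, \x97 an em dash, \xE9 is "e" with acute (cp1252).
$input = "\x93Hi\x94 \x97 caf\xE9";

// Naming the charset lets htmlentities() interpret the Word-specific bytes.
$encoded = htmlentities($input, ENT_QUOTES, 'cp1252');
```

This avoids maintaining the extra table entries by hand, at the cost of relying on the cp1252 support built into htmlentities().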
yes at king22 dot com
10-Apr-2007 08:33
I was searching for a fast replacement for the MS Word special characters that are not covered by get_html_translation_table(). I think the following function might help someone:

<?php
function clean_up($str){
    $str = stripslashes($str);
    $str = strtr($str, get_html_translation_table(HTML_ENTITIES));
    $str = str_replace(
        array("\x82", "\x84", "\x85", "\x91", "\x92", "\x93", "\x94", "\x95", "\x96", "\x97"),
        array("&#8218;", "&#8222;", "&#8230;", "&#8216;", "&#8217;", "&#8220;", "&#8221;", "&#8226;", "&#8211;", "&#8212;"),
        $str
    );
    return $str;
}
?>

It replaces all types of quotes (single and double), horizontal ellipsis (...), bullet, en dash and em dash.
chris
21-Feb-2007 05:49
A lot of quite common characters (or at least not rare, like oelig, euro or minus) are missing from the table unfortunately.
Here are some, if you want to make your translation table more complete and your xml data less error-prone. Not sure why some characters have 2 codes, just use one. Here goes: '&apos;'=>'&#39;', '&minus;'=>'&#45;', '&circ;'=>'&#94;', '&tilde;'=>'&#126;', '&Scaron;'=>'&#138;', '&lsaquo;'=>'&#139;', '&OElig;'=>'&#140;', '&lsquo;'=>'&#145;', '&rsquo;'=>'&#146;', '&ldquo;'=>'&#147;', '&rdquo;'=>'&#148;', '&bull;'=>'&#149;', '&ndash;'=>'&#150;', '&mdash;'=>'&#151;', '&tilde;'=>'&#152;', '&trade;'=>'&#153;', '&scaron;'=>'&#154;', '&rsaquo;'=>'&#155;', '&oelig;'=>'&#156;', '&Yuml;'=>'&#159;', '&yuml;'=>'&#255;', '&OElig;'=>'&#338;', '&oelig;'=>'&#339;', '&Scaron;'=>'&#352;', '&scaron;'=>'&#353;', '&Yuml;'=>'&#376;', '&fnof;'=>'&#402;', '&circ;'=>'&#710;', '&tilde;'=>'&#732;', '&Alpha;'=>'&#913;', '&Beta;'=>'&#914;', '&Gamma;'=>'&#915;', '&Delta;'=>'&#916;', '&Epsilon;'=>'&#917;', '&Zeta;'=>'&#918;', '&Eta;'=>'&#919;', '&Theta;'=>'&#920;', '&Iota;'=>'&#921;', '&Kappa;'=>'&#922;', '&Lambda;'=>'&#923;', '&Mu;'=>'&#924;', '&Nu;'=>'&#925;', '&Xi;'=>'&#926;', '&Omicron;'=>'&#927;', '&Pi;'=>'&#928;', '&Rho;'=>'&#929;', '&Sigma;'=>'&#931;', '&Tau;'=>'&#932;', '&Upsilon;'=>'&#933;', '&Phi;'=>'&#934;', '&Chi;'=>'&#935;', '&Psi;'=>'&#936;', '&Omega;'=>'&#937;', '&alpha;'=>'&#945;', '&beta;'=>'&#946;', '&gamma;'=>'&#947;', '&delta;'=>'&#948;', '&epsilon;'=>'&#949;', '&zeta;'=>'&#950;', '&eta;'=>'&#951;', '&theta;'=>'&#952;', '&iota;'=>'&#953;', '&kappa;'=>'&#954;', '&lambda;'=>'&#955;', '&mu;'=>'&#956;', '&nu;'=>'&#957;', '&xi;'=>'&#958;', '&omicron;'=>'&#959;', '&pi;'=>'&#960;', '&rho;'=>'&#961;', '&sigmaf;'=>'&#962;', '&sigma;'=>'&#963;', '&tau;'=>'&#964;', '&upsilon;'=>'&#965;', '&phi;'=>'&#966;', '&chi;'=>'&#967;', '&psi;'=>'&#968;', '&omega;'=>'&#969;', '&thetasym;'=>'&#977;', '&upsih;'=>'&#978;', '&piv;'=>'&#982;', '&ensp;'=>'&#8194;', '&emsp;'=>'&#8195;', '&thinsp;'=>'&#8201;', '&zwnj;'=>'&#8204;', '&zwj;'=>'&#8205;', '&lrm;'=>'&#8206;', '&rlm;'=>'&#8207;', 
'&ndash;'=>'&#8211;', '&mdash;'=>'&#8212;', '&lsquo;'=>'&#8216;', '&rsquo;'=>'&#8217;', '&sbquo;'=>'&#8218;', '&ldquo;'=>'&#8220;', '&rdquo;'=>'&#8221;', '&bdquo;'=>'&#8222;', '&dagger;'=>'&#8224;', '&Dagger;'=>'&#8225;', '&bull;'=>'&#8226;', '&hellip;'=>'&#8230;', '&permil;'=>'&#8240;', '&prime;'=>'&#8242;', '&Prime;'=>'&#8243;', '&lsaquo;'=>'&#8249;', '&rsaquo;'=>'&#8250;', '&oline;'=>'&#8254;', '&frasl;'=>'&#8260;', '&euro;'=>'&#8364;'
chris
21-Feb-2007 05:49
and a few more :
'&image;'=>'&#8465;', '&weierp;'=>'&#8472;', '&real;'=>'&#8476;', '&trade;'=>'&#8482;', '&alefsym;'=>'&#8501;', '&larr;'=>'&#8592;', '&uarr;'=>'&#8593;', '&rarr;'=>'&#8594;', '&darr;'=>'&#8595;', '&harr;'=>'&#8596;', '&crarr;'=>'&#8629;', '&lArr;'=>'&#8656;', '&uArr;'=>'&#8657;', '&rArr;'=>'&#8658;', '&dArr;'=>'&#8659;', '&hArr;'=>'&#8660;', '&forall;'=>'&#8704;', '&part;'=>'&#8706;', '&exist;'=>'&#8707;', '&empty;'=>'&#8709;', '&nabla;'=>'&#8711;', '&isin;'=>'&#8712;', '&notin;'=>'&#8713;', '&ni;'=>'&#8715;', '&prod;'=>'&#8719;', '&sum;'=>'&#8721;', '&minus;'=>'&#8722;', '&lowast;'=>'&#8727;', '&radic;'=>'&#8730;', '&prop;'=>'&#8733;', '&infin;'=>'&#8734;', '&ang;'=>'&#8736;', '&and;'=>'&#8743;', '&or;'=>'&#8744;', '&cap;'=>'&#8745;', '&cup;'=>'&#8746;', '&int;'=>'&#8747;', '&there4;'=>'&#8756;', '&sim;'=>'&#8764;', '&cong;'=>'&#8773;', '&asymp;'=>'&#8776;', '&ne;'=>'&#8800;', '&equiv;'=>'&#8801;', '&le;'=>'&#8804;', '&ge;'=>'&#8805;', '&sub;'=>'&#8834;', '&sup;'=>'&#8835;', '&nsub;'=>'&#8836;', '&sube;'=>'&#8838;', '&supe;'=>'&#8839;', '&oplus;'=>'&#8853;', '&otimes;'=>'&#8855;', '&perp;'=>'&#8869;', '&sdot;'=>'&#8901;', '&lceil;'=>'&#8968;', '&rceil;'=>'&#8969;', '&lfloor;'=>'&#8970;', '&rfloor;'=>'&#8971;', '&lang;'=>'&#9001;', '&rang;'=>'&#9002;', '&loz;'=>'&#9674;', '&spades;'=>'&#9824;', '&clubs;'=>'&#9827;', '&hearts;'=>'&#9829;', '&diams;'=>'&#9830;'
Jérôme Jaglale
31-Dec-2006 11:43
htmlentities() covers everything htmlspecialchars() does, so here's how to convert a UTF-8 string:
htmlentities($string, ENT_QUOTES, 'UTF-8');
zohar at zohararad dot com
04-Dec-2006 06:31
Another way of converting HTML entities into numeric entities to please XML parsers is using two arrays as conversion tables in a preg_replace function. The conversion table mechanism is based on Ryan's examples above.

<?php
function xmlEntities($s){
    //build first an assoc. array with the entities we want to match
    $table1 = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES);

    //now build another assoc. array with the entities we want to replace (numeric entities)
    foreach ($table1 as $k=>$v){
        $table1[$k] = "/$v/";
        $c = htmlentities($k,ENT_QUOTES,"UTF-8");
        $table2[$c] = "&#".ord($k).";";
    }

    //now perform a replacement using preg_replace
    //each matched value in array 1 will be replaced with the corresponding value in array 2
    $s = preg_replace($table1,$table2,$s);
    return $s;
}
?>
trukin at gmail dot com
29-Oct-2006 11:25
There have been issues when Hispanic websites or other websites don't use the correct collation in MySQL.

One resulting problem is that accented characters (éä ... ) turn into weird characters when a backup is made and restored later on, or when the data is moved to another database.

To fix this, try something like this:
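function accents($text){
    // strtr() substitutes in a single pass, so entities that were just
    // inserted are not encoded again (a str_replace() loop would turn
    // &quot; into &amp;quot; once it reached the & entry)
    return strtr($text, get_html_translation_table(HTML_ENTITIES));
}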

and use as accents("Hello ....... WITH ACCENTS") and it will return the escaped string.
edwardzyang at thewritingpot dot com
23-Jul-2006 07:04
Quite disappointingly, get_html_translation_table() only gives the characters for ISO-8859-1, making it quite useless for UTF-8 or anything else like that (as a previous commenter noticed).
Patrick nospam at nospam mesopia dot com
29-May-2005 07:00
Not sure what's going on here but I've run into a problem that others might face as well...

<?php

$translations = array_flip(get_html_translation_table(HTML_ENTITIES,ENT_QUOTES));

?>

returns the single quote ' as being equal to &#39; while

<?php

$translatedString = htmlentities($string,ENT_QUOTES);

?>
returns it as being equal to &#039;

I've had to do a specific string replacement for the time being... Not sure if it's an issue with the function or the array manipulation.

-Pat
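
Both spellings are decimal references to the same character (code position 39), so a decoder that understands numeric references treats them alike. A minimal sketch, assuming html_entity_decode() is available (PHP 4.3 or later):

```php
<?php
// With and without the leading zero: the same numeric character reference.
$short  = html_entity_decode('&#39;', ENT_QUOTES);
$padded = html_entity_decode('&#039;', ENT_QUOTES);

// True when both decode to a plain single quote.
$same = ($short === "'" && $padded === "'");
```

Decoding this way sidesteps the need for a special-case string replacement.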
Alex Minkoff
18-May-2005 04:30
If you want to display special HTML entities in a web browser, you can use the following code:

<?php
$entities = get_html_translation_table(HTML_ENTITIES);
foreach ($entities as $entity) {
    $new_entities[$entity] = htmlspecialchars($entity);
}
echo "<pre>";
print_r($new_entities);
echo "</pre>";
?>

If you don't, the key name of each element will appear to be the same as the element content itself, making it look mighty stupid. ;)
ryan at ryancannon dot com
26-Jan-2005 02:05
In XML, you can't assume that the doctype will include the same character entity definitions as HTML. XML authors may require character references instead. The following two functions use get_html_translation_table() to encode data in numeric references. The second, optional argument can be used to substitute a different translation table.

function xmlcharacters($string, $trans='') {
    $trans=(is_array($trans))? $trans:get_html_translation_table(HTML_ENTITIES, ENT_QUOTES);
    foreach ($trans as $k=>$v)
        $trans[$k]= "&#".ord($k).";";
    return strtr($string, $trans);
}
function xml_character_decode($string, $trans='') {
    $trans=(is_array($trans))? $trans:get_html_translation_table(HTML_ENTITIES, ENT_QUOTES);
    foreach ($trans as $k=>$v)
        $trans[$k]= "&#".ord($k).";";
    $trans=array_flip($trans);
    return strtr($string, $trans);
}
kevin_bro at hostedstuff dot com
03-Jan-2003 06:06
Alan's version didn't seem to work right. If you're having the same problem, consider using this slightly modified version instead:

function unhtmlentities ($string)  {
   $trans_tbl = get_html_translation_table (HTML_ENTITIES);
   $trans_tbl = array_flip ($trans_tbl);
   $ret = strtr ($string, $trans_tbl);
   return preg_replace('/&#(\d+);/me',
      "chr('\\1')",$ret);
}
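
The /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0. On those versions the same idea can be sketched with preg_replace_callback() and a closure, which needs PHP 5.3 or later. The function name unhtmlentities_cb() is hypothetical, and the ISO-8859-1 table is requested explicitly (assuming PHP 5.3.4 or later) to mirror the single-byte behaviour of the original.

```php
<?php
// Hypothetical rewrite of unhtmlentities() without the /e modifier.
function unhtmlentities_cb($string) {
    // Flip the table so that entities map back to characters.
    $trans_tbl = array_flip(
        get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'ISO-8859-1')
    );
    $ret = strtr($string, $trans_tbl);

    // Decode remaining decimal references; chr() only covers single bytes.
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($m) {
            return chr((int) $m[1]);
        },
        $ret
    );
}

$plain = unhtmlentities_cb('&lt;b&gt; &#65;&#66;');
```

Like the original, this only makes sense for code positions below 256; for general use html_entity_decode() is probably the safer choice.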
alan at akbkhome dot com
03-Jun-2002 10:00
If you want to decode all those &#123; symbols as well....

function unhtmlentities ($string)  {
    $trans_tbl = get_html_translation_table (HTML_ENTITIES);
    $trans_tbl = array_flip ($trans_tbl);
    $ret = strtr ($string, $trans_tbl);
    return  preg_replace('/\&\#([0-9]+)\;/me',
        "chr('\\1')",$ret);
}
dirk at hartmann dot net
19-Jun-2001 01:41
get_html_translation_table
It works only with the first 256 code positions.
For higher positions, for example &#1092;
(a Cyrillic letter), it shows the same.

Copyright © 2006 - 2008 MickMel Inc