<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: PHP 6 and Request Decoding</title>
	<atom:link href="http://zmievski.org/2007/02/php-6-and-request-decoding/feed" rel="self" type="application/rss+xml" />
	<link>http://gravitonic.com/2007/02/php-6-and-request-decoding</link>
	<description>Life, technology, and other good things</description>
	<lastBuildDate>Tue, 24 Aug 2010 14:52:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Rob</title>
		<link>http://zmievski.org/2007/02/php-6-and-request-decoding/comment-page-1#comment-626</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Sun, 27 Jan 2008 11:02:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.gravitonic.com/n/?p=360#comment-626</guid>
		<description>Hello,
I use IIS and I have decoding problems too.
When I click on a UTF-8 encoded URL, the resulting query string replaces any non-English characters with question marks. Why? I use UTF-8 encoding in my pages! Thank you.
</description>
		<content:encoded><![CDATA[<p>Hello,<br />
I use IIS and I have decoding problems too.<br />
When I click on a UTF-8 encoded URL, the resulting query string replaces any non-English characters with question marks. Why? I use UTF-8 encoding in my pages! Thank you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrei</title>
		<link>http://zmievski.org/2007/02/php-6-and-request-decoding/comment-page-1#comment-625</link>
		<dc:creator>Andrei</dc:creator>
		<pubDate>Thu, 22 Feb 2007 16:16:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.gravitonic.com/n/?p=360#comment-625</guid>
		<description>@Basil: If the conversion happened prior to script execution, we would have to stop at the first error and set some global error flag, since we cannot meaningfully continue. With the new approach, the conversions trigger the normal PHP error handler, which does not necessarily stop. Since the error handler may be a custom user one, there are some potential problems, depending on what the user error handler does. But like I said, this is a manageable issue.

@Andrew: If there are conversion problems from raw data to $_* variables, the conversion error handler will be invoked. The application may set a custom error handler, if so desired. If you want to receive data with offending characters dropped, you can set the global error mode in your .ini file to U_CONV_ERROR_SKIP. You will be able to get at the raw data directly. We have not considered providing sanitizing functions like the ones you described, but if we do, they will probably be part of ext/filter.
</description>
		<content:encoded><![CDATA[<p>@Basil: If the conversion happened prior to script execution, we would have to stop at the first error and set some global error flag, since we cannot meaningfully continue. With the new approach, the conversions trigger the normal PHP error handler, which does not necessarily stop. Since the error handler may be a custom user one, there are some potential problems, depending on what the user error handler does. But like I said, this is a manageable issue.</p>
<p>@Andrew: If there are conversion problems from raw data to $_* variables, the conversion error handler will be invoked. The application may set a custom error handler, if so desired. If you want to receive data with offending characters dropped, you can set the global error mode in your .ini file to U_CONV_ERROR_SKIP. You will be able to get at the raw data directly. We have not considered providing sanitizing functions like the ones you described, but if we do, they will probably be part of ext/filter.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew Magruder</title>
		<link>http://zmievski.org/2007/02/php-6-and-request-decoding/comment-page-1#comment-624</link>
		<dc:creator>Andrew Magruder</dc:creator>
		<pubDate>Thu, 22 Feb 2007 14:18:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.gravitonic.com/n/?p=360#comment-624</guid>
		<description>We&#039;re experiencing this problem with PHP5.x now.  What I don&#039;t understand from your proposal is how  PHP handles errors and how to address those errors at the application level.

I think delaying it until usage time is a good thing.  I think allowing the application to specify an encoding is a good thing.

But...

What happens when the conversion from raw to $_ variables fails because of an invalid character encoding?

How does the application find out about it?

Will the application (please!) get the converted data with offending characters dropped (rather than substituted for a question mark)?

Are you planning on providing some way to get at the raw data directly? (So we can build what we need if it won&#039;t be part of base-PHP.)

Is there any provision for the PHP application developer to write a mapping function to replace well-known problematic characters at JIT-translation time?  For example, we are routinely POSTed Latin-1 encoded Euro symbols in otherwise UTF-8 encoded POSTs.

If suitable diagnostics are not provided by base-level PHP, the only application recourse I see, if you want to be *sure* that you&#039;re doing the right thing, is to rip $_ to every encoding your application might expect to deal with (about 4-5 for us) and then do string comparisons to look for missing data.  Did I miss something?  (I looked at the php-dev list for a bit and didn&#039;t find the answers to these questions.)
</description>
		<content:encoded><![CDATA[<p>We&#8217;re experiencing this problem with PHP5.x now.  What I don&#8217;t understand from your proposal is how  PHP handles errors and how to address those errors at the application level.</p>
<p>I think delaying it until usage time is a good thing.  I think allowing the application to specify an encoding is a good thing.</p>
<p>But&#8230;</p>
<p>What happens when the conversion from raw to $_ variables fails because of an invalid character encoding?</p>
<p>How does the application find out about it?</p>
<p>Will the application (please!) get the converted data with offending characters dropped (rather than substituted for a question mark)?</p>
<p>Are you planning on providing some way to get at the raw data directly? (So we can build what we need if it won&#8217;t be part of base-PHP.)</p>
<p>Is there any provision for the PHP application developer to write a mapping function to replace well-known problematic characters at JIT-translation time?  For example, we are routinely POSTed Latin-1 encoded Euro symbols in otherwise UTF-8 encoded POSTs.</p>
<p>If suitable diagnostics are not provided by base-level PHP, the only application recourse I see, if you want to be *sure* that you&#8217;re doing the right thing, is to rip $_ to every encoding your application might expect to deal with (about 4-5 for us) and then do string comparisons to look for missing data.  Did I miss something?  (I looked at the php-dev list for a bit and didn&#8217;t find the answers to these questions.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Basil Gohar</title>
		<link>http://zmievski.org/2007/02/php-6-and-request-decoding/comment-page-1#comment-623</link>
		<dc:creator>Basil Gohar</dc:creator>
		<pubDate>Thu, 22 Feb 2007 13:41:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.gravitonic.com/n/?p=360#comment-623</guid>
		<description>I personally cannot find any serious flaw with this design - how do other environments deal with Unicode?  Do they even parse incoming request information, or do they just allow runtime functions to handle the decoding?

Also, I am not entirely sure I understood Rasmus&#039; concerns.  Isn&#039;t the issue of bogus data present regardless of when/how the conversion process happens?  Is it a security concern, such that if the errors happened prior to the JIT routine, less harm could be done?
</description>
		<content:encoded><![CDATA[<p>I personally cannot find any serious flaw with this design &#8211; how do other environments deal with Unicode?  Do they even parse incoming request information, or do they just allow runtime functions to handle the decoding?</p>
<p>Also, I am not entirely sure I understood Rasmus&#8217; concerns.  Isn&#8217;t the issue of bogus data present regardless of when/how the conversion process happens?  Is it a security concern, such that if the errors happened prior to the JIT routine, less harm could be done?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
