Banshee
the secure PHP framework

Forum

umlaut in search hits

Heiko
7 june 2018, 23:23
After a search, the preview of the hits does not show the umlauts; they appear as HTML special chars:

&ouml; and not ö


In the HTML source it is:
&amp;ouml;

so the browser does not render it as an umlaut.

I tried this in the search controller:
str_replace("&amp;", "&", $hit["content"]);

But this does not help.

I can't find the place where I can fix it correctly.

How can I do this properly?
I think this is an issue for other German users too.

Thank you.
Heiko
Hugo Leisink
8 june 2018, 00:39
Weird. When I enter an ö in a page on my website and search for it, it is displayed correctly.
Joe Schmoe
8 june 2018, 13:45
What web server are you using? Is it sending a different Content-Type header?

Here is an example from the Banshee website using the curl command.

root@host$ curl -I https://www.banshee-php.org


HTTP/1.1 200 OK
Date: Fri, 08 Jun 2018 11:37:54 GMT
Server: Hiawatha v10.9
Connection: keep-alive
Strict-Transport-Security: max-age=31536000
Set-Cookie: banshee_session_id=bef8fc3f2baba9fb26bf46d2c925bcad3b41ed5c47c3995b22ca9e158101cdc0aaacecf8a7ae736bb9e10636c7db1e31dcb6ee7937c6c7e125a502e69b98476e; path=/; HttpOnly
X-Frame-Options: sameorigin
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Content-Security-Policy: default-src 'self' 'unsafe-inline' 'unsafe-eval'; img-src * data:
Referrer-Policy: same-origin
Content-Type: text/html; charset=utf-8
Content-Language: en
Content-Length: 4125
Vary: Accept-Encoding
X-Powered-By: Banshee PHP framework v6.3
Heiko
9 june 2018, 08:20
Hiawatha. And UTF-8 always works properly.

The issue is only in the /search module. In the page output everything is correct.

I can reproduce it on demo.banshee-php.org too. Create a page and search for the content. The hit output is wrong when there is an &uuml; in the content. CKEditor always uses the HTML entity &uuml; instead of the ü itself, and I always edit my pages with CKEditor.

When I manually change it to ü, there is no issue. But when there is &uuml;, the output from the /search module replaces the & with &amp;

It looks like "&" is being escaped to "&amp;".
I want to deactivate that in /search, but I can't find out how.
Heiko
9 june 2018, 09:14
This in controllers/search.php would solve it, but is very unsightly:
<code>
$chars = array("&auml;", "&ouml;", "&uuml;", "&Auml;", "&Ouml;", "&Uuml;", "&szlig;");
$replace = array("ä", "ö", "ü", "Ä", "Ö", "Ü", "ß");
$hit["content"] = str_replace($chars, $replace, $hit["content"]);
</code>
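
A more general variant of the workaround above, as a minimal sketch: PHP's built-in html_entity_decode() converts all named entities to UTF-8 characters, so no hand-written list is needed. The helper name decode_hit_content() and the sample text are hypothetical and not part of Banshee; the sketch only assumes that $hit["content"] holds the entity-encoded text shown in this thread.

```php
<?php
// Hypothetical helper (not part of Banshee): turn HTML character references
// such as &uuml; or &szlig; into the corresponding UTF-8 characters.
function decode_hit_content(string $content): string {
    return html_entity_decode($content, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Sample data standing in for one search hit.
$hit = array('content' => 'Gr&uuml;&szlig;e aus K&ouml;ln');
$hit['content'] = decode_hit_content($hit['content']);
```

Note that this also decodes &lt; and &gt; into real angle brackets, so it is only reasonable while the view still escapes its output.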
Hugo Leisink
9 june 2018, 09:27
I've created a test page [demo.banshee-php.org]. If I search for 'test', I see this in the result:

This is a test. Char: ü Code: &uuml;


Looks ok to me.
Heiko
9 june 2018, 12:46
When you create a new page on the demo site with this content:

<p>heiko</p>
<p>html special: &auml;&ouml;&uuml;&nbsp;</p>
<p> ori: äöü </p>

Then go to the search page and search for: heiko

You will see this output: https://pasteboard.co/Hp54L2C.png

The output is wrong.
Hugo Leisink
9 june 2018, 20:48
To fix this, add disable-output-escaping="yes" to the xsl:value-of tag at line 48 of views/search.xslt. I haven't fully looked at the security implications of this change, so be cautious.
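
To illustrate what that attribute changes, here is a small sketch using PHP's XSLTProcessor (it needs the dom and xsl extensions). The stylesheet, the hit/content element names and the function render_hit() are made up for this example and are not the contents of views/search.xslt.

```php
<?php
// Illustration only: a stand-in stylesheet, not Banshee's views/search.xslt.
function render_hit(string $content, bool $disable_escaping): string {
    // Build <hit><content>...</content></hit> with $content as plain text.
    $xml = new DOMDocument();
    $hit = $xml->appendChild($xml->createElement('hit'));
    $node = $hit->appendChild($xml->createElement('content'));
    $node->appendChild($xml->createTextNode($content));

    // Optionally add the attribute discussed in this thread.
    $attr = $disable_escaping ? ' disable-output-escaping="yes"' : '';
    $xsl = new DOMDocument();
    $xsl->loadXML(
        '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' .
        '<xsl:output method="html" encoding="utf-8" />' .
        '<xsl:template match="/hit">' .
        '<p><xsl:value-of select="content"' . $attr . ' /></p>' .
        '</xsl:template>' .
        '</xsl:stylesheet>'
    );

    $processor = new XSLTProcessor();
    $processor->importStylesheet($xsl);
    return (string)$processor->transformToXml($xml);
}

$content = 'M&uuml;ller <b>bold</b>';
$escaped = render_hit($content, false); // "&" and "<" are written as entities
$raw = render_hit($content, true);      // text is written out unchanged
```

With escaping left on, the stored &uuml; reaches the browser as &amp;uuml;, which is the symptom described above. With escaping disabled, the entity renders as ü, but any markup in the text is passed through as well.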
Heiko
10 june 2018, 10:51
This works for me. This is the option I could not find and was asking for.
Heiko
10 june 2018, 11:19
The same option is used in the weblog, but not in the default website layout. Please let us know if you think there is a security risk.
Thank you. In that case I would use my "dirty" workaround instead.
Hugo Leisink
10 june 2018, 22:16
The weblog message does not contain data from visitors. The search potentially does (forum messages).
Heiko
24 june 2018, 16:15
OK, so if I leave the forum out of the search, no XSS is possible. Thanks.
Heiko
2 july 2018, 11:12
Update: the best solution is to change the CKEditor settings so that it writes UTF-8 characters instead of HTML entities:
CKEDITOR.config.entities = false;
Hugo Leisink
3 july 2018, 00:54
Thanks!