🔥 Avoid setting/imposing TRACE level for everyone (#165)

* 🔥 Avoid setting/imposing TRACE level for everyone

Staying as far as humanly possible from any controversial situation

* 📝 Add docs entry about this
TAHRI Ahmed R 5 months ago committed by GitHub
commit d2d4217955
  1. README.md: 1 line changed
  2. charset_normalizer/api.py: 3 lines changed
  3. docs/user/miscellaneous.rst: 26 lines changed
  4. tests/test_logging.py: 6 lines changed

1
README.md

@@ -39,6 +39,7 @@ This project offers you an alternative to **Universal Charset Encoding Detector*
<img src="https://i.imgflip.com/373iay.gif" alt="Reading Normalized Text" width="226"/><img src="https://media.tenor.com/images/c0180f70732a18b4965448d33adba3d0/tenor.gif" alt="Cat Reading Text" width="200"/>
*\*\* : They are clearly using specific code for a specific encoding even if covering most of used one*<br>
+Did you get here because of the logs? See [https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html](https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html)
## ⭐ Your support

3
charset_normalizer/api.py

@@ -25,7 +25,8 @@ from .utils import (
should_strip_sig_or_bom,
)
-logging.addLevelName(TRACE, "TRACE")
+# Will most likely be controversial
+# logging.addLevelName(TRACE, "TRACE")
logger = logging.getLogger("charset_normalizer")
explain_handler = logging.StreamHandler()
explain_handler.setFormatter(

26
docs/user/miscellaneous.rst

@@ -18,3 +18,29 @@ Any ``CharsetMatch`` object can be transformed to exploitable ``str`` variable.
# This should print '我没有埋怨,磋砣的只是一些时间。'
print(str(result))
+Logging
+-------
+Prior to version 2.0.10 you may encounter some unexpected logs in your streams.
+Something along the lines of:
+::
+... | WARNING | override steps (5) and chunk_size (512) as content does not fit (465 byte(s) given) parameters.
+... | INFO | ascii passed initial chaos probing. Mean measured chaos is 0.000000 %
+... | INFO | ascii should target any language(s) of ['Latin Based']
+It is most likely because you altered the root logger instance. The package has its own logic behind logging and why
+it is useful. See https://docs.python.org/3/howto/logging.html to learn the basics.
+If you are looking to silence and/or drastically reduce the amount of logs, please upgrade to the latest version
+available for `charset-normalizer` using your package manager or by `pip install charset-normalizer -U`.
+The latest version will no longer produce any entry greater than `DEBUG`.
+On `DEBUG` only one entry will be observed and that is about the detection result.
+The other log entries will be pushed as `Level 5`, commonly known as the TRACE level, but we do
+not register it globally.
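
The doc entry above says the remaining entries are emitted at numeric level 5 on the package logger. A minimal sketch of how a consumer might silence or opt in to those entries, using only the standard `logging` module: the logger name `charset_normalizer` comes from `api.py`, while the in-memory stream, the format string, and the sample messages are assumptions made for this illustration and are not produced by the package itself.

```python
import io
import logging

TRACE = 5  # numeric value of the unregistered level described in the docs entry

# Same logger name as in charset_normalizer/api.py; the messages below are
# emitted by this sketch, not by the package.
logger = logging.getLogger("charset_normalizer")

stream = io.StringIO()  # hypothetical sink so the demo output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s | %(message)s"))
logger.addHandler(handler)
logger.propagate = False  # keep the demo records away from the root logger

# Option 1: silence everything below WARNING.
logger.setLevel(logging.WARNING)
logger.log(TRACE, "hidden trace entry")
logger.debug("hidden debug entry")
silenced_output = stream.getvalue()

# Option 2: opt in to the verbose level-5 entries.
logger.setLevel(TRACE)
logger.log(TRACE, "visible trace entry")
verbose_output = stream.getvalue()

# Clean up so the sketch leaves no handler behind.
logger.removeHandler(handler)
```

Setting the level on the named logger rather than on the root logger keeps the choice local to this package, which matches the intent of the commit.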

6
tests/test_logging.py

@@ -18,7 +18,7 @@ class TestLogBehaviorClass:
from_bytes(test_sequence, steps=1, chunk_size=50, explain=True)
assert explain_handler not in self.logger.handlers
for record in caplog.records:
-assert record.levelname in ["TRACE", "DEBUG"]
+assert record.levelname in ["Level 5", "DEBUG"]
def test_explain_false_handler_set_behavior(self, caplog):
test_sequence = b'This is a test sequence of bytes that should be sufficient'
@@ -26,7 +26,7 @@ class TestLogBehaviorClass:
from_bytes(test_sequence, steps=1, chunk_size=50, explain=False)
assert any(isinstance(hdl, logging.StreamHandler) for hdl in self.logger.handlers)
for record in caplog.records:
-assert record.levelname in ["TRACE", "DEBUG"]
+assert record.levelname in ["Level 5", "DEBUG"]
assert "Encoding detection: ascii is most likely the one." in caplog.text
def test_set_stream_handler(self, caplog):
@@ -35,7 +35,7 @@ class TestLogBehaviorClass:
)
self.logger.debug("log content should log with default format")
for record in caplog.records:
-assert record.levelname in ["TRACE", "DEBUG"]
+assert record.levelname in ["Level 5", "DEBUG"]
assert "log content should log with default format" in caplog.text
def test_set_stream_handler_format(self, caplog):
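
The test changes above swap `"TRACE"` for `"Level 5"` because the standard library falls back to a generic name when a numeric level has no registered name. A small sketch of that behavior, assuming a fresh interpreter in which nothing has registered a name for level 5; the helper `level_name_of` and the record fields are hypothetical and exist only for this illustration.

```python
import logging

TRACE = 5


def level_name_of(level: int) -> str:
    """Return the levelname a LogRecord carries for the given numeric level."""
    record = logging.LogRecord("demo", level, "demo.py", 0, "msg", None, None)
    return record.levelname


# Without registration, logging falls back to the generic "Level <n>" name.
unregistered_name = level_name_of(TRACE)

# Registering the name is a process-wide side effect: every logger in the
# interpreter now reports level 5 as "TRACE". This is the call the commit
# comments out in api.py.
logging.addLevelName(TRACE, "TRACE")
registered_name = level_name_of(TRACE)
```

Because `logging.addLevelName` mutates global state, a library calling it changes how level 5 is displayed for every other package in the same process, which is the situation this commit steps away from.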
