mirror of https://github.com/01-edu/public.git
eslopfer
2 years ago
1 changed files with 42 additions and 0 deletions
@ -0,0 +1,42 @@

## string_tokenizer_count

### Instructions

Create a file `string_tokenizer_count.py` containing a function `tokenizer_counter` that takes a string as a parameter and returns a dictionary mapping each word to the number of times it occurs in the string.

- The function should remove all punctuation from the string and convert it to lowercase before counting the words.

- The function should return the dictionary of words and their counts sorted alphabetically by word.
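
A note on the sorting requirement: since Python 3.7 a `dict` keeps its keys in insertion order, so a "sorted" dictionary can be obtained by rebuilding it from its sorted items. A minimal sketch, in which the `counts` values are made up for illustration:

```python
# Made-up counts, inserted in non-alphabetical order.
counts = {"words": 1, "a": 2, "test": 1}

# sorted() orders the (word, count) pairs by word first;
# dict() then keeps that order because dicts preserve insertion order.
ordered = dict(sorted(counts.items()))

print(ordered)  # {'a': 2, 'test': 1, 'words': 1}
```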

### Usage

Here is an example of how to use the function:

```python
string = "This is a test sentence, with various words and 123 numbers!"
result = tokenizer_counter(string)
print(result)
```

And its output:

```console
{'123': 1, 'a': 1, 'and': 1, 'is': 1, 'numbers': 1, 'sentence': 1, 'test': 1, 'this': 1, 'various': 1, 'with': 1, 'words': 1}
```

### Hints

- The `re` module can be used to remove non-alphanumeric characters.

- The `collections` module can be used to count the words.

- The `operator` module can be used to sort the dictionary alphabetically by word.
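
The three hints could be combined as in the sketch below. It is one possible approach, not the reference solution, and it rests on a few assumptions: words are separated by whitespace, digits count as part of a word, and only ASCII letters need to be kept.

```python
import re
from collections import Counter
from operator import itemgetter


def tokenizer_counter(string):
    # Lowercase, then drop every character that is not an ASCII letter,
    # a digit or whitespace (assumption: this is what "punctuation" means here).
    cleaned = re.sub(r"[^a-z0-9\s]", "", string.lower())
    # Split on whitespace and count each word.
    counts = Counter(cleaned.split())
    # Sort the (word, count) pairs by word and rebuild a plain dict,
    # which keeps that order (Python 3.7+).
    return dict(sorted(counts.items(), key=itemgetter(0)))
```

Returning a plain `dict` rather than the `Counter` itself keeps the printed result in the usual `{...}` form.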

### References

- [`re` module](https://docs.python.org/3/library/re.html)

- [`collections` module](https://docs.python.org/3/library/collections.html)

- [`operator` module](https://docs.python.org/3/library/operator.html)