`_extract_representative_docs` samples with `replace=True`, producing duplicate representative documents #2491

### Describe the bug

`_extract_representative_docs` calls `sample(nr_samples, replace=True)` on each topic's document pool. Because sampling is with replacement, the same document can be drawn more than once, and when a topic has fewer than `nr_samples` unique documents (common for small topics) repeats are unavoidable. These duplicates are then fed into the c-TF-IDF similarity calculation, inflating scores and producing duplicate entries in `representative_docs_`.

The existing `.drop_duplicates()` runs after `.groupby("Topic").sample(...)`, so it only removes exact duplicate rows across the entire result. It does not prevent `replace=True` from drawing the same document multiple times within a single topic's sample.
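To illustrate the sampling behaviour in isolation, here is a minimal pandas sketch on a toy table. It is not BERTopic's code and not necessarily the fix in my fork: the `documents` frame, its column names, and the `sample_without_replacement` helper are assumptions made up for the example. It shows that drawing `nr_samples` rows with replacement from a small topic must repeat documents, and one possible alternative that caps the draw at the topic's size.

```python
import pandas as pd

# Toy stand-in for the per-topic document table (column names are assumptions).
documents = pd.DataFrame(
    {
        "Document": ["a", "b", "c", "d", "e", "f", "g"],
        "Topic": [0, 0, 1, 1, 1, 1, 1],
    }
)
nr_samples = 4

# With replacement: topic 0 has only 2 documents but 4 draws are made,
# so at least 2 of the drawn rows must repeat an earlier draw.
with_replacement = documents.groupby("Topic").sample(
    n=nr_samples, replace=True, random_state=42
)
topic_0_draws = with_replacement.loc[with_replacement["Topic"] == 0, "Document"]
repeats_topic_0 = int(topic_0_draws.duplicated().sum())


def sample_without_replacement(df, n, random_state=42):
    """Hypothetical helper: draw up to n distinct rows per topic."""
    return pd.concat(
        [
            group.sample(n=min(len(group), n), replace=False, random_state=random_state)
            for _, group in df.groupby("Topic")
        ]
    )


# Without replacement: each topic contributes min(topic size, nr_samples) distinct rows.
unique_sample = sample_without_replacement(documents, nr_samples)
print(repeats_topic_0, unique_sample.groupby("Topic").size().to_dict())
```

Capping `n` at the group size keeps small topics from raising an error under `replace=False`, at the cost of returning fewer than `nr_samples` candidates for those topics.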

### Reproduction

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))["data"][:500]
topic_model = BERTopic(min_topic_size=5)
topics, _ = topic_model.fit_transform(docs)

# Check for duplicate representative docs within the same topic
for topic_id, docs_list in topic_model.representative_docs_.items():
    if len(docs_list) != len(set(docs_list)):
        print(f"Topic {topic_id}: {len(docs_list)} docs, {len(set(docs_list))} unique")
```

### BERTopic Version

0.17.4

### Your contribution

I've already worked through a fix for this in my fork, with tests. Happy to open a PR if this looks like the right approach — just let me know.

