INDEX
Explanations
names and references to specific individuals or companies
personal statements and vulnerable people
Negative Logits
Token            Logit
COVID            -0.45
COVID            -0.42
covid            -0.42
SharedCtor       -0.42
https            -0.41
️⃣               -0.41
continúas        -0.39
✅               -0.39
covid            -0.38
ChatGPT          -0.38
Positive Logits
Token            Logit
desmotivaciones  0.54
kasarigan        0.50
ſont             0.48
increí           0.47
femeninos        0.46
motivadoras      0.46
ainfi            0.46
étoient          0.44
stoß             0.42
jden             0.42
Activations Density 0.381%