yay more mandatory training on AI... /s
-
aaaaaaaaaaaaaa the expert one is over 7 hours
-
"ai is the new electricity"
f right off
-
apparently the metrics used to evaluate llm-based systems don't come from anything grounded in reality. they just pass the prompt and response pairs to an llm and ask it to evaluate them. usually the llm doing the evaluation is the same one being evaluated.
so much of this feels entirely unscientific. engineers are treating llms as these magical infallible black boxes without understanding their specific strengths and limitations. it's the ultimate hammer and now literally every problem looks like a nail.
it's also very comical that the examples they are using in the lab produce incredibly generic and horribly useless responses but the llm is scoring them very high in all metrics.
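for anyone who hasn't seen this pattern, here's a rough sketch of what "llm as judge" looks like when the judge is the same model being evaluated. everything here is made up for illustration: `Model`, `judge_score`, `evaluate` and `toy_llm` are hypothetical names, and `toy_llm` is a canned stand-in rather than a real model or api. it is not the code from the training lab.

```python
from typing import Callable

# hypothetical type: any function that maps a prompt string to a reply string
Model = Callable[[str], str]


def judge_score(judge: Model, prompt: str, response: str) -> int:
    """ask `judge` to rate a response 1-5 and return the first valid digit in its reply."""
    rubric = (
        "rate the response to the prompt from 1 to 5 for helpfulness. "
        "reply with a single digit.\n"
        f"prompt: {prompt}\nresponse: {response}"
    )
    reply = judge(rubric)
    for ch in reply:
        if ch in "12345":
            return int(ch)
    raise ValueError(f"no score found in judge reply: {reply!r}")


def evaluate(model: Model, judge: Model, prompts: list[str]) -> float:
    """generate a response per prompt with `model`, score each with `judge`, return the mean."""
    scores = [judge_score(judge, p, model(p)) for p in prompts]
    return sum(scores) / len(scores)


def toy_llm(text: str) -> str:
    """canned stand-in (assumption): generic answer to anything, top marks when asked to rate."""
    if text.startswith("rate the response"):
        return "5"
    return "it depends on your specific needs."


if __name__ == "__main__":
    # same function passed as both model and judge, so nothing outside the loop checks the answer
    print(evaluate(toy_llm, toy_llm, ["how do i rotate a log file?"]))
```

the point of the sketch is that the score only ever reflects what the judge says. passing a different `judge`, or comparing against human-labelled answers, would at least give you a second opinion.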