Release/2.5.0 #1144

Merged
merged 147 commits into from
Dec 23, 2024
9f632a1
Merge pull request #1006 from JohnSnowLabs/main
chakravarthik27 Apr 3, 2024
0143d31
Merge pull request #1032 from JohnSnowLabs/main
chakravarthik27 May 17, 2024
5f1e5b6
Merge pull request #1051 from JohnSnowLabs/main
chakravarthik27 Jun 18, 2024
7eebf9f
Merge pull request #1079 from JohnSnowLabs/main
chakravarthik27 Jul 24, 2024
18b9a72
updated the small changes in quickstart.html
chakravarthik27 Jul 24, 2024
5324ccb
Merge pull request #1126 from JohnSnowLabs/main
chakravarthik27 Sep 23, 2024
7bc418c
resolved: set a round to 2 decimal points in accuracy and fairness re…
chakravarthik27 Sep 30, 2024
207bb63
In accuracy.py: Add DegradationAnalysis class for model performance d…
chakravarthik27 Sep 30, 2024
dfee514
Refactor model initialization in llm_modelhandler.py for azure-openai…
chakravarthik27 Oct 2, 2024
aa0bb6b
fixed the format and lint issues
chakravarthik27 Oct 2, 2024
cf836a6
Merge pull request #1128 from JohnSnowLabs/fix/basic-setup-within-dat…
chakravarthik27 Oct 2, 2024
1ebe0c9
Add DegradationAnalysis class for model performance degradation analy…
chakravarthik27 Oct 4, 2024
67f1c2e
updated: type annotations in base.py and accuracy.py
chakravarthik27 Oct 4, 2024
5c95206
Refactor `degradation_analysis` test in accuracy.py and type annotati…
chakravarthik27 Oct 7, 2024
0b62ae1
Refactor `degradation_analysis` test and type annotations in accuracy…
chakravarthik27 Oct 8, 2024
aace0f7
updated the type annotations and removed unnecessaries.
chakravarthik27 Oct 8, 2024
dc408e8
updated the docs.
chakravarthik27 Oct 9, 2024
b7fef1f
updated the degradation analysis
chakravarthik27 Oct 11, 2024
0c244dc
updated and integrated with Harness Object for degradation_analysis t…
chakravarthik27 Oct 14, 2024
e0a6b86
updated: type annotations.
chakravarthik27 Oct 15, 2024
c759246
updated model_type parameter in Harness and TaskManager classes
chakravarthik27 Oct 15, 2024
a3d1caa
updated: chat models integration from langchain module
chakravarthik27 Oct 16, 2024
36b1916
updated: handling openai and azureopenai hubs in proper way.
chakravarthik27 Oct 16, 2024
f4920a1
Updated: `PretrainedModelForQA` to support `model_type` parameter, en…
chakravarthik27 Oct 16, 2024
c3cedb0
Updated type annotations in TaskManager class
chakravarthik27 Oct 16, 2024
d456fba
updated: code to remove unnecessary kwargs in PretrainedModelForQA an…
chakravarthik27 Oct 17, 2024
7d07c6a
Update model type annotation to support 'completion' type in ModelCon…
chakravarthik27 Oct 17, 2024
e13b779
updated: formatting issues.
chakravarthik27 Oct 17, 2024
9b6862c
fixed temporary issues: in prompt manager
chakravarthik27 Oct 17, 2024
db57f31
fixed error: `model_kwargs` are not used by the model: ['return_full_…
chakravarthik27 Oct 17, 2024
572bbe4
removed unneccesary comments
chakravarthik27 Oct 19, 2024
1d2b02a
Refactor degradation_analysis test and type annotations, store the re…
chakravarthik27 Oct 20, 2024
3db1a94
Merge pull request #1129 from JohnSnowLabs/feature/implement-accuracy…
chakravarthik27 Oct 21, 2024
0d7589d
Merge pull request #1131 from JohnSnowLabs/feature/add-support-for-ch…
chakravarthik27 Oct 21, 2024
93f64ec
Updated: model report function in utils, to support different label c…
chakravarthik27 Oct 22, 2024
86455f4
Refactor model_report function in utils to support different label co…
chakravarthik27 Oct 23, 2024
3f16138
updated: eval_template parameter add to is_pass_eval func and remove …
chakravarthik27 Oct 24, 2024
09e0227
updated: annotation in BaseQASample
chakravarthik27 Oct 25, 2024
936680e
updated: llm_eval.py to add EvalTemplate class for building grading p…
chakravarthik27 Oct 25, 2024
c54fe53
updated: eval_prompt template configureable
chakravarthik27 Oct 26, 2024
0e9df84
updated: is_pass_llm_eval function to handle eval_template parameter
chakravarthik27 Oct 26, 2024
99a6b23
fixed: errors in report_utils.py
chakravarthik27 Oct 28, 2024
9123d9a
fixed: issues in testcases, and llm_eval.py and helpers.py for improv…
chakravarthik27 Oct 28, 2024
926802d
updated:
chakravarthik27 Oct 29, 2024
ebc4594
updated: QASample class to handle model parameters in evaluation conf…
chakravarthik27 Oct 30, 2024
3e2a559
updated: LlmEval class to support dynamic grading lists and modified …
chakravarthik27 Oct 30, 2024
d8e841a
Merge pull request #1133 from JohnSnowLabs/fix/basic-setup-within-dat…
chakravarthik27 Oct 30, 2024
94df648
updated: refined grade list pattern in LlmEval class for improved reg…
chakravarthik27 Oct 30, 2024
cc80ffe
updated: acclerate and spacy packages
chakravarthik27 Oct 31, 2024
7711ab1
updated: pydantic version and fixes errors in prompts.py
chakravarthik27 Oct 31, 2024
30bc463
fix issue at preuse parameter in field_validator
chakravarthik27 Oct 31, 2024
0abb62f
updated the imports for pydantic v1 from pydantic 2.9.2
chakravarthik27 Oct 31, 2024
e194e8f
refactor: update pydantic imports and validators to v1 syntax
chakravarthik27 Oct 31, 2024
2a497a1
updated: downgrade the accelerate package
chakravarthik27 Oct 31, 2024
f4556d4
updated: poetry lock file
chakravarthik27 Oct 31, 2024
b564e9d
reset
chakravarthik27 Oct 31, 2024
cb700f8
updated: upgrade pydantic to v2.9.2 and accelerate to v0.34.2; update…
chakravarthik27 Oct 31, 2024
235881c
updated: python min version : 3.9.0 to 4.0
chakravarthik27 Nov 4, 2024
9d4160e
updated: upgrade johnsnowlabs to v5.5.0, langchain to v0.3.6, langcha…
chakravarthik27 Nov 4, 2024
05bca67
updated: change import path for LLMChain from langchain.chains to lan…
chakravarthik27 Nov 4, 2024
4f2fb1b
added langchain-community package and updated dependencies in poetry.…
chakravarthik27 Nov 4, 2024
3ca61bd
updated: change exception handling in model loading tests to catch ge…
chakravarthik27 Nov 4, 2024
ab21446
Merge pull request #1135 from JohnSnowLabs/update/pydantic-version-an…
dcecchini Nov 4, 2024
187c185
added image transformation classes for robustness: ImageTranslate, Im…
chakravarthik27 Nov 5, 2024
6b6cb13
added ImageCorruptor class for image corruption transformations with …
chakravarthik27 Nov 6, 2024
e320189
added ImageLayeredMask class for image masking transformations with s…
chakravarthik27 Nov 6, 2024
166a716
removed MaskedImage class to streamline image transformation function…
chakravarthik27 Nov 6, 2024
fb33a5d
renamed ImageCorruptor class to ImageBlackSpot and updated alias and …
chakravarthik27 Nov 8, 2024
63d3bd4
added ImageTextOverlay class for image robustness with customizable t…
chakravarthik27 Nov 8, 2024
4be2fe5
refactor ImageTextOverlay class to improve text drawing readability
chakravarthik27 Nov 8, 2024
5bcdf10
added ImageWatermark class for applying customizable watermarks to im…
chakravarthik27 Nov 8, 2024
61f442f
added pytest-cov dependency, fixed formatting in robustness.py, and c…
chakravarthik27 Nov 12, 2024
a584619
refactor: clean up whitespace and improve test parameterization in im…
chakravarthik27 Nov 12, 2024
63cb465
reduced: train.conll for speedup testing.
chakravarthik27 Nov 12, 2024
6400604
added: create augmenter configuration and tests for DataAugmenter fun…
chakravarthik27 Nov 12, 2024
58b4f5b
renamed: test_augmenter to test_data_augmenter
chakravarthik27 Nov 12, 2024
888ef09
refactor: remove unused import from test_image_robustness.py
chakravarthik27 Nov 12, 2024
70a7d3a
Merge pull request #1132 from JohnSnowLabs/feature/enhance-harness-re…
chakravarthik27 Nov 18, 2024
b11fe41
Merge pull request #1138 from JohnSnowLabs/feature/random-masking-on-…
chakravarthik27 Nov 18, 2024
9eaf24d
Merge pull request #1140 from JohnSnowLabs/UnitTesting/add-new-unit-t…
chakravarthik27 Nov 18, 2024
37ebc70
added new overlay classes for enhanced image robustness and updated e…
chakravarthik27 Nov 18, 2024
53fd6ac
enhance ImageRandomTextOverlay with customizable font size, text coun…
chakravarthik27 Nov 18, 2024
3c9a1ff
Merge remote-tracking branch 'origin/release/2.5.0' into feature/rand…
chakravarthik27 Nov 18, 2024
24bf74e
refactor font handling in ImageTextOverlay and ImageRandomTextOverlay…
chakravarthik27 Nov 18, 2024
7d55902
remove unnecessary ImageFont imports from ImageTextOverlay and ImageR…
chakravarthik27 Nov 18, 2024
12cc07e
Merge pull request #1141 from JohnSnowLabs/feature/random-masking-on-…
chakravarthik27 Nov 18, 2024
db89d46
feat: add TypedDict configurations for robustness transformations
chakravarthik27 Nov 20, 2024
1f15db8
feat: enhance robustness tests configuration with TypedDict and type …
chakravarthik27 Nov 20, 2024
ae50a19
feat: add TypedDict configurations for bias tests and enhance type sa…
chakravarthik27 Nov 22, 2024
44cda54
feat: add TypedDict configurations for representation tests to enhanc…
chakravarthik27 Nov 22, 2024
86e48b6
feat: update available_tests method signatures to use TypedDict for i…
chakravarthik27 Nov 22, 2024
03a3412
feat: update TestConfig in representation classes to correct Union ty…
chakravarthik27 Nov 22, 2024
897cf45
feat: introduce HarnessConfig TypedDict and update Harness class to u…
chakravarthik27 Nov 22, 2024
8ec6e79
feat: add TypedDict configurations for fairness tests to enhance type…
chakravarthik27 Nov 22, 2024
5ec4bb9
feat: add TypedDict configurations for TestConfig in BaseAccuracy and…
chakravarthik27 Nov 22, 2024
432d87f
feat: add AccuracyTestsConfig TypedDict for defining accuracy test co…
chakravarthik27 Nov 22, 2024
b42bd58
feat: add ToxicityTestsConfig TypedDict and TestConfig for toxicity t…
chakravarthik27 Nov 22, 2024
b3a591e
feat: add TestConfig TypedDict for safety and security tests, and def…
chakravarthik27 Nov 22, 2024
1fd426e
feat: add TestConfig TypedDict for clinical, grammar, legal, and perf…
chakravarthik27 Nov 22, 2024
3fbdfd2
feat: add TestConfig TypedDict for various test factories to enhance …
chakravarthik27 Nov 22, 2024
aa9b2aa
fix: enhance model loading logic and update dependencies for compatib…
chakravarthik27 Nov 23, 2024
21a9d2b
Merge pull request #1143 from JohnSnowLabs/annotations/improve-the-ty…
chakravarthik27 Nov 23, 2024
3523005
Merge pull request #1145 from JohnSnowLabs/fix/basic-setup-within-dat…
chakravarthik27 Nov 23, 2024
d0d8261
fix: improve model_report function to handle numeric values and initi…
chakravarthik27 Nov 27, 2024
85b3fc9
Merge pull request #1146 from JohnSnowLabs/feature/enhance-harness-re…
chakravarthik27 Nov 27, 2024
20f9f10
adding support for spark dataset in databricks
chakravarthik27 Nov 28, 2024
62163fa
feat: enhance SparkDataset to accept a SparkSession and support dynam…
chakravarthik27 Nov 28, 2024
4e90228
feat: update SparkDataset to dynamically initialize SparkSession and …
chakravarthik27 Nov 29, 2024
98084dd
feat: enhance SparkDataset to support dynamic file paths and improve …
chakravarthik27 Nov 29, 2024
da0a439
feat: extend SparkDataset to load data from any file and introduce Dl…
chakravarthik27 Nov 30, 2024
5589255
feat: add validation for DltDataset file_path and implement data expo…
chakravarthik27 Dec 2, 2024
3bd5ca0
feat: rename DltDataset to DeltaLiveTablesDataset and update dataset …
chakravarthik27 Dec 2, 2024
0578ce6
Merge pull request #1148 from JohnSnowLabs/feature/support-for-loadin…
chakravarthik27 Dec 2, 2024
8a2f061
feat: update dependency version constraints in pyproject.toml for imp…
chakravarthik27 Dec 5, 2024
0742714
Merge pull request #1149 from JohnSnowLabs/update/pydantic-version-an…
chakravarthik27 Dec 5, 2024
b9fbae4
Merge remote-tracking branch 'origin/release/2.5.0' into chore/final_…
chakravarthik27 Dec 9, 2024
76e72fd
updated the release notes in website
chakravarthik27 Dec 9, 2024
37c833e
updated the pagination for release notes
chakravarthik27 Dec 9, 2024
41236de
updated: typos in layout
chakravarthik27 Dec 9, 2024
324ddb0
add integrations link in navigation.yml
chakravarthik27 Dec 9, 2024
4b1a48e
added the content for databricks integration with langtest.
chakravarthik27 Dec 9, 2024
b7c9fac
Update docs/pages/docs/langtest_versions/release_notes_2_3_1.md
chakravarthik27 Dec 13, 2024
de628f8
updated: added FAQ section to troubleshooting guide for Databricks in…
chakravarthik27 Dec 13, 2024
f70495d
updated the workflow and add results df to dlt tables.
chakravarthik27 Dec 13, 2024
f6a7ead
added the notebook for degradation analysis test
chakravarthik27 Dec 16, 2024
652d688
feat: enhance DegradationAnalysis to support question-answering tasks…
chakravarthik27 Dec 16, 2024
5921cf1
feat: skip samples with None ground truth in DegradationAnalysis accu…
chakravarthik27 Dec 16, 2024
cc46917
fix: correctly decrement total count when skipping samples with None …
chakravarthik27 Dec 16, 2024
c30a310
fix: handle cases where ground truth is missing in DegradationAnalysi…
chakravarthik27 Dec 16, 2024
d1c18ae
feat: make qa_evaluation a static method in DegradationAnalysis for q…
chakravarthik27 Dec 16, 2024
c672e5b
refactor: update variable names for clarity in DegradationAnalysis ac…
chakravarthik27 Dec 17, 2024
b495108
Update langtest/transform/accuracy.py
chakravarthik27 Dec 17, 2024
2e45753
Merge pull request #1153 from JohnSnowLabs/feature/support-for-qa-tas…
chakravarthik27 Dec 17, 2024
497b86b
Merge pull request #1150 from JohnSnowLabs/chore/final_website_updates
chakravarthik27 Dec 17, 2024
be7fa29
updated the notebook.
chakravarthik27 Dec 17, 2024
2620c4d
feat: add langchain-openai to databricks dependencies and create llms…
chakravarthik27 Dec 17, 2024
f5619b9
feat: updated the poetry.lock file for add langchain-openai to databr…
chakravarthik27 Dec 18, 2024
a41f584
feat: handle dictionary input for prompt formatting in PretrainedMode…
chakravarthik27 Dec 18, 2024
596d987
feat: update prompt handling to support 'instruct' and 'completion' t…
chakravarthik27 Dec 18, 2024
0dc2a6d
update notebook with new features and improvements
chakravarthik27 Dec 20, 2024
58d4715
added new Visual_QA_II.ipynb
chakravarthik27 Dec 20, 2024
9911e8b
fix: update Colab link in Visual_QA_II notebook to point to the corre…
chakravarthik27 Dec 20, 2024
417615f
added the nb for custom chat template config.
chakravarthik27 Dec 23, 2024
7b28f91
refactor: remove unnecessary import statements from Custom_Chat_Templ…
chakravarthik27 Dec 23, 2024
300d7e1
updated nb with gpt-4o llm evaluation
chakravarthik27 Dec 23, 2024
3e435af
Merge pull request #1155 from JohnSnowLabs/chore/final_website_updates
chakravarthik27 Dec 23, 2024
ac5d98f
renamed the notebook "LangTest_Databricks_Integration"
chakravarthik27 Dec 23, 2024
2 changes: 1 addition & 1 deletion .github/workflows/build_and_test.yml
@@ -17,7 +17,7 @@ jobs:
strategy:
fail-fast: false
matrix:
-python-version: [ "3.8", "3.9","3.10" ]
+python-version: [ "3.9","3.10", "3.11" ]

steps:
- name: Free up disk space at start
1,637 changes: 1,637 additions & 0 deletions demo/tutorials/llm_notebooks/LangTest_Databricks_Integration.ipynb

Large diffs are not rendered by default.

936 changes: 936 additions & 0 deletions demo/tutorials/llm_notebooks/Visual_QA_II.ipynb

Large diffs are not rendered by default.

5,626 changes: 5,626 additions & 0 deletions demo/tutorials/misc/Custom_Chat_Template_Config.ipynb

Large diffs are not rendered by default.

3,223 changes: 3,223 additions & 0 deletions demo/tutorials/misc/Degradation_Analysis_Test.ipynb

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions docs/_data/navigation.yml
@@ -31,6 +31,8 @@ docs-menu:
url: /docs/pages/docs/install
- title: One Liners
url: /docs/pages/docs/one_liner
+- title: Integrations
+url: /docs/pages/docs/integrations

- title: General Concepts
url: /docs/pages/docs/harness
9 changes: 7 additions & 2 deletions docs/_includes/docs-langtest-pagination.html
@@ -1,15 +1,20 @@
<ul class="pagination owl-carousel pagination_big">
<!-- <li><a href="release_notes_2_5_0">2.5.0</a></li> -->
<li><a href="release_notes_2_4_0">2.4.0</a></li>
<li><a href="release_notes_2_3_1">2.3.1</a></li>
<li><a href="release_notes_2_3_0">2.3.0</a></li>
<li><a href="release_notes_2_2_0">2.2.0</a></li>
<li><a href="release_notes_2_1_0">2.1.0</a></li>
<li><a href="release_notes_2_0_0">2.0.0</a></li>
<li><a href="release_notes_1_10_0">1.10.0</a></li>
<li><a href="release_notes_1_9_0">1.9.0</a></li>
<li><a href="release_notes_1_8_0">1.8.0</a></li>
<li><a href="release_notes_1_7_0">1.7.0</a></li>
<li><a href="release_notes_1_6_0">1.6.0</a></li>
<li><a href="release_notes_1_5_0">1.5.0</a></li>
<!-- <li><a href="release_notes_1_5_0">1.5.0</a></li>
<li><a href="release_notes_1_4_0">1.4.0</a></li>
<li><a href="release_notes_1_3_0">1.3.0</a></li>
<li><a href="release_notes_1_2_0">1.2.0</a></li>
<!-- <li><a href="release_notes_1_1_0">1.1.0</a></li>
<li><a href="release_notes_1_1_0">1.1.0</a></li>
<li><a href="release_notes_1_0_0">1.0.0</a></li> -->
</ul>
11 changes: 2 additions & 9 deletions docs/api/quick_start.html
@@ -368,7 +368,7 @@ <h1>Quick Start<a class="headerlink" href="#quick-start" title="Permalink to thi
<h2>LangTest Quick Start<a class="headerlink" href="#langtest-quick-start" title="Permalink to this heading">#</a></h2>
<p>The following can be used as a quick reference on how to get up and running with <code class="docutils literal notranslate"><span class="pre">langtest</span></code>:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Install langtest from PyPI</span>
-pip<span class="w"> </span>install<span class="w"> </span><span class="nv">langtest</span><span class="o">==</span><span class="m">1</span>.1.0
+pip<span class="w"> </span>install<span class="w"> </span><span class="nv">langtest</span><span class="o">==</span><span class="m">2</span>.3.1
</pre></div>
</div>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">langtest</span> <span class="kn">import</span> <span class="n">Harness</span>
@@ -386,20 +386,13 @@ <h2>Alternative Installation Options<a class="headerlink" href="#alternative-ins
<p>We can create a Python <a class="reference external" href="https://virtualenv.pypa.io/en/latest/">Virtualenv</a>:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>virtualenv<span class="w"> </span>langtest<span class="w"> </span>--python<span class="o">=</span>python3.8
<span class="nb">source</span><span class="w"> </span>langtest/bin/activate
-pip<span class="w"> </span>install<span class="w"> </span><span class="nv">langtest</span><span class="o">==</span><span class="m">1</span>.1.0<span class="w"> </span>jupyter
+pip<span class="w"> </span>install<span class="w"> </span><span class="nv">langtest</span><span class="o">==</span><span class="m">2</span>.3.1<span class="w"> </span>jupyter
</pre></div>
</div>
<p>Now you should be ready to create a jupyter notebook with LangTest running:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>jupyter<span class="w"> </span>notebook
</pre></div>
</div>
<p>We can also use conda and create a new <a class="reference external" href="https://docs.conda.io/projects/conda/en/latest/index.html">conda</a> environment to manage all the dependencies there.</p>
<p>Then we can create a new environment <code class="docutils literal notranslate"><span class="pre">langtest</span></code> and install the <code class="docutils literal notranslate"><span class="pre">langtest</span></code> package with pip:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>conda<span class="w"> </span>create<span class="w"> </span>-n<span class="w"> </span>langtest<span class="w"> </span><span class="nv">python</span><span class="o">=</span><span class="m">3</span>.8<span class="w"> </span>-y
conda<span class="w"> </span>activate<span class="w"> </span>langtest
conda<span class="w"> </span>install<span class="w"> </span>-c<span class="w"> </span><span class="nv">langtest</span><span class="o">==</span><span class="m">1</span>.1.0<span class="w"> </span>jupyter
</pre></div>
</div>
<p>Now you should be ready to create a jupyter notebook with LangTest running:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>jupyter<span class="w"> </span>notebook
</pre></div>
167 changes: 167 additions & 0 deletions docs/pages/docs/integrations.md
@@ -0,0 +1,167 @@
---
layout: docs
seotitle: Integrations | LangTest | John Snow Labs
title: Integrations
permalink: /docs/pages/docs/integrations
key: docs-integrations
modify_date: "2023-03-28"
header: true
---

<div class="main-docs" markdown="1">
<div class="h3-box" markdown="1">


**LangTest** is an open-source Python library that empowers developers to build safe and reliable Natural Language Processing (NLP) models. It seamlessly integrates with popular platforms and tools, including **Databricks**, enabling scalable testing and evaluation. Install LangTest easily using pip to enhance your NLP workflows.

</div>
<div class="h3-box" markdown="1">

## Databricks

**Introduction**
LangTest is a powerful tool for testing and evaluating NLP models, and integrating it with Databricks allows users to scale their testing with large datasets and leverage real-time analytics. This integration streamlines the process of assessing model performance, ensuring high-quality results while maintaining scalability and efficiency. With Databricks, LangTest becomes an even more versatile solution for NLP practitioners working with substantial data pipelines and diverse datasets.

**Prerequisites**
Before starting, ensure you meet the following requirements. You need access to a Databricks Workspace and an installed version of the `LangTest` package (version `2.5.0` or later). Additionally, have your Databricks API keys or credentials ready and Python (version 3.9 or later) installed on your system. Optionally, access to sample datasets is helpful for testing and exploring features during your initial setup.

#### **Step-by-Step Setup**

Getting started with LangTest and Databricks is straightforward and involves a few simple steps. Follow the instructions below to set up and run your first NLP model test.

1. **Install LangTest and Dependencies**
Begin by installing LangTest using pip:
```bash
pip install langtest==2.5.0
```
Ensure all required dependencies are installed and your environment is ready.

2. **Load Datasets from Databricks**
Use the Databricks connector to load data directly into your LangTest pipeline:
```python
from pyspark.sql import DataFrame

# Load the dataset into a Spark DataFrame
df: DataFrame = spark.read.json("<FILE_PATH>")

```
Print the DataFrame schema:
```python
df.printSchema()
```

3. **Configuration**
In this section, we will configure the tests, datasets, and model settings required to effectively use LangTest. This includes setting up the test parameters, loading datasets, and defining the model configuration to ensure seamless integration and accurate evaluation.

- **Tests Config:**

```python
test_config = {
    "tests": {
        "defaults": {"min_pass_rate": 1.0},
        "robustness": {
            "add_typo": {"min_pass_rate": 0.7},
            "lowercase": {"min_pass_rate": 0.7},
        },
    },
}
```

- **Dataset Config:**

```python
input_data = {
    "data_source": df,
    "source": "spark",
    "spark_session": spark  # the Spark session must already be started
}
```

- **Model Config:**

```python
model_config = {
    "model": {
        "endpoint": "databricks-meta-llama-3-1-70b-instruct",
    },
    "hub": "databricks",
    "type": "chat"
}
```


4. **Set Up and Run Tests with Harness**
Use the `Harness` class to configure, generate, and execute tests. Define your task, model, data, and configuration:

```python
from langtest import Harness

harness = Harness(
    task="question-answering",
    model=model_config,
    data=input_data,
    config=test_config
)
```

Generate and execute the test cases to evaluate the model with LangTest:
```python
harness.generate().run().report()
```

To review the test cases:
```python
testcases_df = harness.testcases()
testcases_df
```

To save the test cases to Delta Live Tables:
```python
import os
from deltalake import DeltaTable
from deltalake.writer import write_deltalake

write_deltalake("tmp/langtest_testcases", testcases_df) # for existing tables, pass mode="append"

```

To review the generated results:
```python
results_df = harness.generated_results()
results_df
```

Similarly, to save `results_df` to Delta Live Tables:
```python
import os
from deltalake import DeltaTable
from deltalake.writer import write_deltalake

write_deltalake("tmp/langtest_generated_results", results_df) # for existing tables, pass mode="append"

```
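
The two export snippets above differ only in the target path, and their inline comments note that existing tables need `mode="append"`. A minimal sketch of a wrapper that picks the mode automatically is shown below. It assumes a local filesystem path, and the helper names `pick_write_mode` and `save_dataframe` are hypothetical; they are not part of LangTest or `deltalake`.

```python
import os


def pick_write_mode(table_path: str) -> str:
    """Return "append" if a Delta table already exists at table_path, else "error".

    A Delta table directory contains a `_delta_log` subdirectory, so its
    presence is used here as the existence check (local paths only).
    """
    if os.path.isdir(os.path.join(table_path, "_delta_log")):
        return "append"
    # "error" is write_deltalake's default: create the table, fail if it exists.
    return "error"


def save_dataframe(table_path: str, df) -> None:
    """Write a pandas DataFrame to a Delta table, appending if it already exists."""
    # Imported lazily so pick_write_mode stays usable without deltalake installed.
    from deltalake.writer import write_deltalake

    write_deltalake(table_path, df, mode=pick_write_mode(table_path))
```

With this in place, `save_dataframe("tmp/langtest_testcases", testcases_df)` could be called on every run without editing the mode by hand.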

This process evaluates your model's performance on the loaded data and provides a comprehensive report of the results.
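
How the thresholds in `test_config` relate to the report can be sketched in plain Python. This is an illustration rather than LangTest's implementation: it assumes that a per-test `min_pass_rate` overrides the value under `defaults`, and that a test passes when its share of passing samples reaches the threshold. Both function names are hypothetical.

```python
test_config = {
    "tests": {
        "defaults": {"min_pass_rate": 1.0},
        "robustness": {
            "add_typo": {"min_pass_rate": 0.7},
            "lowercase": {"min_pass_rate": 0.7},
        },
    },
}


def resolve_min_pass_rate(config: dict, category: str, test_name: str) -> float:
    """Look up a test's threshold, falling back to the defaults entry."""
    tests = config["tests"]
    default = tests.get("defaults", {}).get("min_pass_rate", 1.0)
    return tests.get(category, {}).get(test_name, {}).get("min_pass_rate", default)


def meets_threshold(passed: int, total: int, min_pass_rate: float) -> bool:
    """Return True when passed/total reaches the minimum pass rate."""
    if total == 0:
        return False  # assumption: no samples means nothing to certify
    return passed / total >= min_pass_rate


threshold = resolve_min_pass_rate(test_config, "robustness", "add_typo")
print(threshold)                          # 0.7
print(meets_threshold(8, 10, threshold))  # True: 0.8 >= 0.7
print(meets_threshold(6, 10, threshold))  # False: 0.6 < 0.7
```

Under these assumptions, a test not listed in the config (for example a hypothetical `uppercase` entry) would inherit the strict default of `1.0`.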

By following these steps, you can easily integrate Databricks with LangTest to perform NLP or LLM model testing. If you encounter issues during setup or execution, refer to the troubleshooting section for solutions.

**Troubleshooting & Support**
While setting up, you may encounter common issues like authentication errors with Databricks, incorrect dataset paths, or model compatibility problems. To resolve these, verify your API keys and workspace URL, ensure the specified dataset exists in Databricks, and confirm that your LangTest version is compatible with your project. If further help is needed, explore the FAQ section, access detailed documentation, or reach out through the support channels or community forum for assistance.
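
A quick pre-flight check can catch missing credentials before any test is generated. The sketch below assumes credentials are supplied through the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, a common convention for Databricks tooling. This page does not say how LangTest reads them, so adjust the names to your setup. `missing_databricks_settings` is a hypothetical helper.

```python
import os

REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")  # assumed variable names


def missing_databricks_settings(env=None) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_databricks_settings()
if missing:
    print("Missing Databricks settings:", ", ".join(missing))
else:
    print("Databricks credentials found.")
```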

### FAQ

**Q: How do I resolve authentication errors with Databricks?**
A: Ensure that your API keys and workspace URL are correct. Double-check that your credentials have the necessary permissions to access the Databricks workspace.

**Q: What should I do if the dataset path is incorrect?**
A: Verify that the specified dataset exists in Databricks and that the path is correctly formatted. You can use the Databricks UI to navigate and confirm the dataset location.

**Q: How can I check if my LangTest version is compatible with my project?**
A: Refer to the LangTest documentation for version compatibility information. Ensure that you are using a version of LangTest that supports the features and integrations required for your project.
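
As a starting point, the installed version can be compared against the `2.5.0` minimum stated in the prerequisites. The sketch below uses only the standard library; `parse_version` and `is_supported` are hypothetical helpers that handle plain `major.minor.patch` strings and ignore pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn "2.5.0" into (2, 5, 0), keeping only each part's leading digits."""
    numbers = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:  # pad "2.5" to (2, 5, 0)
        numbers.append(0)
    return tuple(numbers)


def is_supported(installed: str, minimum: str = "2.5.0") -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = version("langtest")
    print(installed, "is supported" if is_supported(installed) else "is too old")
except PackageNotFoundError:
    print("langtest is not installed")
```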

**Q: Where can I find more detailed documentation?**
A: Access the detailed documentation on the LangTest official website or the Databricks documentation portal for comprehensive guides and examples.

**Q: How can I get additional support?**
A: Reach out through the support channels provided by LangTest or Databricks. You can also join the community forum to ask questions and share experiences with other users.


</div></div>