
Closed
Paid on delivery
I’m evaluating security vulnerabilities in three Arabic-capable language models (Allam, Falcon, and Fanar) by running the Garak prompt-injection suite. My top priority is the technical implementation, with a particular emphasis on translating each of Garak’s 256 English attack prompts into clear, natural Arabic before the tests run.

Here’s how the workflow looks:
• Build a Python notebook that loads the three models from Hugging Face (PyTorch backend), pipes the Arabic prompts through Garak, captures logits and full responses, and writes everything to tidy CSV files.
• Include bilingual testing so the notebook can toggle between the original English prompts and their Arabic counterparts, allowing side-by-side success-rate comparison.
• Produce a concise pandas analysis section that calculates and visualises attack success percentages per model and per language.
• Document every step, from model acquisition commands to translation strategy and evaluation metrics, in markdown cells so the methodology is fully reproducible.

Acceptance criteria
• Notebook runs end-to-end on a fresh environment (tested with Python 3.10, Transformers latest, and Garak).
• CSV result files and an HTML/PDF export of the notebook are generated automatically.
• Translation quality preserves the adversarial intent of each prompt; no machine-literal phrasing that could weaken the attack.
• Final section summarises Arabic vs. English success rates and highlights any model-specific patterns.

If you’re comfortable with Hugging Face Transformers, pandas, and the nuances of Arabic prompting in LLM security contexts, I’d love to collaborate and get this testing pipeline running smoothly.
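The pandas analysis step in the workflow could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's or any bidder's implementation: the column names (`model`, `language`, `prompt_id`, `attack_succeeded`), the `success_rates` helper, and the placeholder rows are all hypothetical, and it presumes each CSV row records one prompt run with a boolean verdict.

```python
import pandas as pd


def success_rates(results: pd.DataFrame) -> pd.DataFrame:
    """Return attack success percentages: one row per model, one column per language."""
    return (
        results.groupby(["model", "language"])["attack_succeeded"]
        .mean()              # fraction of prompts whose attack succeeded
        .mul(100)            # convert the fraction to a percentage
        .round(1)
        .unstack("language")  # languages become columns for side-by-side comparison
    )


# Placeholder rows only, to show the assumed schema; these are not real results.
demo = pd.DataFrame(
    {
        "model": ["model_a"] * 4 + ["model_b"] * 4,
        "language": ["en", "en", "ar", "ar"] * 2,
        "prompt_id": [1, 2, 1, 2, 1, 2, 1, 2],
        "attack_succeeded": [True, False, True, True, False, False, True, False],
    }
)

table = success_rates(demo)
print(table)
```

With matplotlib installed, `table.plot.bar()` would draw the same figures as a grouped bar chart, one group per model.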
Project ID: 40228724
50 proposals
Remote project
Active 1 mo ago
50 freelancers are bidding on average $171 USD for this job

⭐⭐⭐⭐⭐ Evaluate Security Vulnerabilities in Arabic Language Models Efficiently ❇️ Hi My Friend, I hope you're doing well. I've reviewed your project requirements and see you're looking for an expert to evaluate security vulnerabilities in Arabic language models. Look no further; Zohaib is here to help you! My team has successfully completed over 50 similar projects focused on security evaluation. I will build a Python notebook to load the models, translate prompts, and analyze results, ensuring everything runs smoothly within your budget. ➡️ Why Me? I can easily handle your project as I have 5 years of experience in Python programming and security evaluations. My expertise includes working with Hugging Face, data analysis, and prompt engineering. I also have a strong grip on other relevant technologies, ensuring a thorough approach to your project. ➡️ Let's have a quick chat to discuss your project in detail, and I can showcase samples of my previous work. I look forward to discussing this with you in our chat. ➡️ Skills & Experience: ✅ Python Programming ✅ Hugging Face Transformers ✅ Data Analysis with Pandas ✅ Security Testing ✅ Bilingual Testing ✅ CSV File Management ✅ Markdown Documentation ✅ Logits Capture ✅ Model Evaluation ✅ Prompt Translation ✅ Adversarial Testing ✅ Performance Visualization Waiting for your response! Best Regards, Zohaib
$150 USD in 2 days
7.8

✅ Proposal for Arabic LLM Prompt Injection T With a strong background in cybersecurity and natural language processing, I am ideally suited to execute the Arabic LLM Prompt Injection T project. My proficiency in Python, experience with Hugging Face Transformers, and expertise in both English and Arabic linguistic nuances ensure a flawless translation and testing process. I have previously conducted similar security analyses on language models, focusing on maintaining the adversarial intent in translations. My technical skills in data analysis with pandas will enable precise success rate visualization and comparison. I am committed to delivering a comprehensive and reproducible notebook that meets all acceptance criteria. Let’s collaborate to enhance the security framework of these models.
$250 USD in 7 days
6.9

Hello, I trust you're doing well. I am well experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
$300 USD in 7 days
7.2

Assalamo alaikom, I have experience benchmarking, fine-tuning, and even pretraining LLMs and would be happy to work on your project. Let me know if you are available for a discussion, and please check my reviews as well.
$500 USD in 7 days
6.3

Being experts in A.I and Machine Learning, my team and I are well-versed in using Python, which is crucial for this project's successful implementation of the Garak prompt-injection suite. Our proficiency with Hugging Face Transformers will enable us to effortlessly load the Allam, Falcon, and Fanar models from PyTorch, as required for your environment. The task at hand demands a nuanced understanding not only of the technical aspects but also of the tactical use of Arabic language models (LLM) in cybersecurity. Our deep knowledge of Arabic and fluency in Python, particularly with Pandas, would ensure smooth processing of your CSV files and an accurate linguistic translation without compromising the effectiveness of each attack. Additionally, our commitment to thorough documentation and reproducible methodologies aligns perfectly with your acceptance criterion. We've also got you covered for automatic generation of the CSV files and HTML/PDF export of the notebook that wraps up comprehensive insights about Arabic vs English success rates and notable model-specific patterns. In a nutshell, our extensive experience amalgamated with our passion for delivering quality work would make us an ideal fit for this task. We look forward to collaborating with you on this significant project and taking your LLM security testing pipeline to new heights!
$250 USD in 7 days
6.3

Hello, I have over 7 years of experience in Machine Learning (ML) and Python. I have carefully read the requirements for evaluating security vulnerabilities in Arabic-capable language models using the Garak prompt-injection suite. To complete this project, I will first build a Python notebook that loads the models from Hugging Face with a PyTorch backend. The notebook will translate the English attack prompts into clear Arabic, capture logits and responses, and save the data in CSV files. Additionally, I will implement bilingual testing to compare success rates between the original English prompts and their Arabic translations. A pandas analysis section will be included to visualize attack success percentages per model and language. I will thoroughly document each step in markdown cells to ensure reproducibility. The acceptance criteria include running the notebook on a fresh environment, generating result files automatically, maintaining high translation quality, and summarizing success rates effectively. I am keen to discuss this project further in detail. Please connect with me for a chat. You can visit my Profile: https://www.freelancer.com/u/HiraMahmood4072 Thank you.
$100 USD in 2 days
5.4

With my solid background in Full Stack Web and Mobile Application Development, as well as my expertise in Python and Machine Learning (particularly with TensorFlow and PyTorch), I am well-equipped to meet all the technical challenges of your Arabic LLM Prompt Injection Testing project. I have worked extensively with Hugging Face Transformers and am familiar with the nuances of Arabic prompting in LLM security contexts. This means I can effectively build a Python notebook that performs all the necessary tasks from loading the models to translating prompts, capturing logits and responses, and generating tidy CSV files for analysis. Moreover, my ability to translate your Garak's 256 English attack prompts into clear, natural Arabic is an additional asset for this project. Being bilingual, I understand the paramount significance of preserving not only the meaning but also the adversarial intent in each translated prompt. I assure you that the translation quality produced will not weaken the potency of any attack. Lastly, one notable trait of my work is meticulous documentation. For this project, I would document every single step thoroughly in markdown cells, giving you full transparency into my methodology and ensuring reproducibility. My aim is to deliver an end-to-end solution that runs smoothly on a fresh environment and automatically generates CSV result files and an HTML/PDF export of the notebook
$140 USD in 7 days
5.4

I understand the need for technical precision in this project. My extensive background in data analysis and machine learning, particularly in Python, makes me highly adept at using the Hugging Face Transformers library and working with pandas. I possess not only theoretical knowledge but also practical experience with Google Cloud Vision, which involves complex tasks like dealing with large datasets and processing images, a skillset that comes in handy for the dataset we'll be working with. Additionally, my experience ensures that the notebook will run seamlessly on your desired Python version (3.10) and Transformers latest, ensuring easy reproducibility. Furthermore, Arabic poses unique nuances in LLM security contexts, which calls for an expert who is proficient in both Arabic and the intricacies of language models. With fluency in Arabic and strong problem-solving abilities honed over years of multitasking as a programmer, I am confident that I can deliver accurate and coherent translations, thus preserving the adversarial intent of each prompt with zero machine-literal phrasing.
$150 USD in 3 days
4.9

As a bilingual professional with expertise in translation, I believe I would be the ideal candidate to assist you in your Arabic LLM Prompt Injection Testing project. Working with leading entities in the Middle East for 7 years has allowed me to master both English and Arabic, as well as the nuances of language and context, a skill vital to translating the Garak prompts correctly. Moreover, my experience in content creation and management for social media platforms positions me strongly for this project. I understand the importance of not only translating but also preserving the adversarial intent of each prompt, so nothing is lost in translation. I have demonstrated this through my copywriting and scriptwriting over the years by taking care that messages remain consistent and powerful across cultural barriers. To further add to my skillset, I am proficient in Python, which allows me to adapt quickly to your project requirements. My familiarity with Pandas will help me produce a concise, structured, and comprehensive analysis section that meets all your specifications. Let us collaborate on this project and see it through from beginning to end.
$140 USD in 7 days
5.0

Hi randa41, we went through your project description, and our team seems like a great fit for this job. Let's connect in chat so that we can discuss further. Regards
$70 USD in 1 day
4.0

✅✅As someone who is well-versed in Hugging Face Transformers, data analysis and Python, I'm confident in my ability to execute this project with precision and efficiency. My expertise in AI development and machine learning puts me in a strong position for implementing the Garak prompt-injection suite, working with multiple models and producing meaningful analysis. One of my key strengths is my deep understanding of the nuances of different languages including Arabic, which I believe is crucial for this particular task. I have a track record of providing precise translations that maintain the intended meaning while being native to the language, ensuring that no adversarial intent is lost in the process. Additionally, my meticulous approach to documentation will ensure that every step of the workflow is properly annotated and reproducible so you can understand and replicate the pipeline without any difficulties. Collaboration and clear communication are paramount to me, and I'm highly committed to delivering high-quality results within agreed timelines. Let's create a secure and robust testing pipeline for your Arabic-capable LLMs together!
$55.55 USD in 7 days
3.5

Hello there, As an experienced researcher, data scientist, and data analyst, my qualitative analysis skills perfectly align with your job requirements. My profound knowledge of Python and R Studio guarantees fast learning and adaptation to new tools. Moreover, my advanced skills in Excel make me highly competent in handling large datasets efficiently, making me proficient in extracting the best insights from your transcripts. I fully comprehend the importance of working papers and meticulously preparing financial statements, especially within strict timelines. My sharp analytical skills and extensive knowledge of Excel ensure that I leave no stone unturned in making sure every detail is covered under evaluation. My passion for quality, originality and meeting deadlines makes me an excellent choice for this project. I cannot wait to prove my extensive skills to you through providing actionable insights that will help guide your decision making regarding domestic charter flights. Best Regards
$30 USD in 1 day
4.0

I’m experienced with Hugging Face Transformers (PyTorch backend), adversarial prompt evaluation, and structured experiment pipelines in Python (pandas, Jupyter, reproducible ML workflows). I can build a fully documented notebook that loads Allam, Falcon, and Fanar, integrates the Garak prompt-injection suite, and runs bilingual attack evaluations with clean CSV logging of logits and full responses. I’ll carefully translate all 256 prompts into natural, high-quality Arabic that preserves adversarial intent, avoiding literal phrasing that could weaken the attack signal. The notebook will include a toggle for English vs. Arabic testing, automated CSV + HTML/PDF export, and a concise analysis section visualizing success rates per model and per language. Everything will be reproducible in a fresh Python 3.10 environment with documented setup steps and evaluation methodology. Ready to get this pipeline implemented end-to-end.
$80 USD in 7 days
3.3

I understand exactly what you want to achieve with this project: building a pipeline to test Arabic prompt injection vulnerabilities using Garak, with high-quality Arabic adversarial prompt translation and clear evaluation results. I have experience in prompt engineering, so I can preserve the attack intent while translating into natural Arabic. I’m comfortable working with LLM workflows, Hugging Face models, and pandas for analysis, and I can handle the full implementation end-to-end. I focus on clean, reproducible notebooks and clear documentation. I’m ready to start quickly. Feel free to reach out to discuss details or request a demo.
$155 USD in 7 days
3.5

Hello there Hope you are doing well. I can build a Python notebook to test Allam, Falcon, and Fanar with the Garak prompt-injection suite, translating all 256 prompts into natural Arabic while preserving adversarial intent. With eight years of experience in Python, Hugging Face Transformers, and LLM security testing, I’ll set up bilingual execution, capture logits and responses, output tidy CSVs, and provide pandas analysis with visualizations of attack success per model and language. The notebook will include full markdown documentation, reproducible methodology, and automatic HTML/PDF export with clear summaries of Arabic vs. English results. I’d love to discuss more based on the requirement—let’s connect. Best regards, Mobasher Reza.
$240 USD in 3 days
3.2

With 7 years of experience in cybersecurity, I am the best fit to complete the Arabic LLM Prompt Injection Testing project. I have the relevant skills to evaluate security vulnerabilities in Arabic-capable language models like Allam, Falcon, and Fanar using the Garak prompt-injection suite. How I will complete this project: - Build a Python notebook to load the models from Hugging Face with a PyTorch backend. - Translate Garak's 256 English attack prompts into clear, natural Arabic for testing. - Implement bilingual testing for side-by-side comparison of success rates. - Conduct pandas analysis to visualize attack success percentages per model and language. - Document the methodology in markdown cells for reproducibility. Tech stack I will use: - Python 3.10 - Hugging Face Transformers - Pandas I have worked on similar solutions in the past and understand the technical nuances of Arabic prompting in LLM security contexts. The notebook will run end-to-end on a fresh environment, generating CSV result files and an HTML/PDF export automatically. The translation quality will be ensured to preserve the adversarial intent of each prompt. The final section will summarize Arabic vs. English success rates and highlight any model-specific patterns. Let's collaborate to streamline this testing pipeline effectively.
$33 USD in 7 days
2.8

This is not simply a translation task, it is a controlled adversarial testing pipeline across multilingual model behaviour. The structural risk sits in prompt drift during translation, inconsistent tokenisation between Arabic and English, and non deterministic logging that corrupts comparative results. Without strict reproducibility controls, success rate metrics become unreliable. Comparable LLM evaluation systems required deterministic seed control, structured prompt indexing, and isolated inference loops to prevent cross contamination of results. The solution would define a modular notebook architecture, model loader abstraction for Allam, Falcon, and Fanar, bilingual prompt registry with ID locking, Garak execution wrapper, logits capture layer, and structured CSV output schema. Pandas analysis would compute per model and per language success metrics with controlled visualisation. Full markdown documentation would ensure reproducibility from environment setup to metric interpretation. This ensures methodological integrity and defensible findings. Share the target model versions and hardware constraints, and the execution framework will be structured for stable, end to end validation. Cheers, Lance Full Stack Digital Director
$129 USD in 7 days
2.7

✅✅As a seasoned software developer with extensive experience in AI and automation, I'm well-versed in exactly the kind of testing your project requires. Throughout my 8+ years in the industry, I've had opportunities to develop AI-powered applications, deal with multilingual prompts in Natural Language Processing contexts, and effectively organize and document complex workflows, all of which would be invaluable in executing your prompt-injection suite. A distinguishing aspect of my work is my meticulous attention to detail. I understand the significance of thorough documentation for reproducibility, particularly when evaluating security vulnerabilities. Accordingly, I'll ensure every step of my strategy, from model acquisition commands to translation methodology, will be clearly documented in markdown cells for a fully traceable project. Another strength I bring to this project is my adaptability with languages and libraries. From Python 3.10 to TensorFlow and Garak, I have expanded my toolkit extensively to encompass the best technology choices over time and remain an efficient problem solver. Additionally, as an avid proponent of efficient communication, you can expect regular updates on the project's progress and clear visibility into potential risks or challenges that may arise throughout its execution.
$55.55 USD in 7 days
2.4

Hello randa41, I'm Dax Manning, with over 8 years of experience in Python and Machine Learning, specializing in developing efficient solutions for complex projects. I have carefully reviewed your project requirements for evaluating security vulnerabilities in Arabic-capable language models using the Garak prompt-injection suite. I plan to build a Python notebook that seamlessly integrates the three models from Hugging Face, conducts bilingual testing, generates tidy CSV files, and provides a detailed pandas analysis for comprehensive evaluation. I believe my expertise aligns well with the technical aspects of your project. Let's discuss further in the chat to explore how we can collaborate effectively to achieve your goals. Thanks, Dax Manning
$140 USD in 7 days
2.0

Hi, I would like to take this opportunity, and I will work until you are 100% satisfied with my work. I applied after reading your job posting carefully, and I believe I am a good fit for your project. I'm a serious bidder, and I will satisfy you with my skills. I am an expert with 8+ years of experience in Python, Translation, Machine Learning (ML), Data Analysis, Pandas, Prompt Engineering, Natural Language Processing, Hugging Face, AI Development, and LLM Integration. I look forward to meeting you to discuss the project in more detail and to hearing from you. Warm Regards
$150 USD in 7 days
1.6

Riyadh, Saudi Arabia
Payment method verified
Member since Feb 26, 2025