Releases: microsoft/PubSec-Info-Assistant

v1.2

20 Sep 02:39
df055a7

What's New

  • GPT-4o model support
  • US Government cloud deployments now support Azure AI Search with semantic ranking, which measurably improves search relevance by using language understanding to rank results.
  • Secure mode configuration
    • Includes configuration for scenarios where infrastructure security and privacy are essential, such as those in the public sector and regulated industries. Key features of “secure mode” include:
      • Disabling public network access: Restrict external access to keep the deployed services off the public internet.
      • Virtual network protection: Deploy your Azure services within a secure virtual network.
      • Private endpoints: The deployed Azure services connect exclusively through private endpoints within a virtual network where available.
      • Data encryption at rest and in transit: Ensure encryption of data when stored and during transmission.
  • Compatible with Microsoft Cloud for Sovereignty

v1.1.1

29 May 08:29
66e324c

What's New

  • Support for migrating or upgrading from v1.0 to v1.1.x. For details, see the documentation page "Moving from v1.0 to v1.1".
  • API Version updates
    • openai=1.17.0
    • langchain=0.1.16
    • langchain-openai=0.1.3
    • tiktoken=0.5.2
    • Azure OpenAI API='2024-02-01'
  • Streaming of responses in the UX. Answers begin streaming into the chat as they are generated, without waiting for the full answer to be returned.
  • Support for adding a Service Management Reference and multiple owners on the Microsoft Entra object created for Info Assistant. Following recommended security baselines, these new variables can be used to satisfy policy requirements when deploying.
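
The streaming behavior described above can be pictured with a small generator. This is only a sketch and not the Info Assistant implementation: the function name `stream_partial_answers` and the idea of feeding it a plain list of text chunks are assumptions made for illustration.

```python
from typing import Iterable, Iterator


def stream_partial_answers(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the answer-so-far each time a new non-empty chunk arrives.

    Hypothetical helper: `chunks` stands in for whatever incremental
    pieces of text a model endpoint returns while it is still generating.
    """
    answer = ""
    for chunk in chunks:
        if not chunk:
            # Skip empty keep-alive pieces so the UI is not redrawn needlessly.
            continue
        answer += chunk
        yield answer


if __name__ == "__main__":
    # Each printed line is what a chat window could show at that moment.
    for partial in stream_partial_answers(["Hel", "", "lo", " world"]):
        print(partial)
```

The point of the pattern is that the caller can render each partial answer immediately instead of blocking until the last chunk has arrived.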

Full Changelog: v1.1...v1.1.1

v1.1

29 Apr 22:15
878475d

What's New

  • New chat modes. Work + Web searches the web (via Bing APIs) and lets you compare and contrast answers from your work documents with web search results. Ungrounded allows direct interaction with an LLM, without grounding, for creative and fully generative responses.
  • Enhanced file management for files loaded into IA, including filtering, resubmitting, and deleting files, plus detailed views of processing status and error messages without needing IT to look in a database.
  • SharePoint document libraries (not lists or websites) as a source, configurable to point at multiple sites and/or folders. Documents are copied into IA for processing, so RBAC is not maintained. The process supports all CRUD operations for files in the configured SharePoint document libraries.
  • Compatible with US Gov sovereign clouds including Azure OpenAI hosted in US Gov regions
  • Documentation on using Bing Safe Search and Azure OpenAI Content Filtering controls, which let Azure customers tailor content filtering behavior to their needs while aiming to prevent potentially harmful generated content and copyright violations from public content

Preview Features

  • Assistants/Autonomous Agents are a stateful evolution of the traditional generative AI application, in which you have to manage conversation state and tool integrations and execute them manually. Assistants/Autonomous Agents manage tool/function integrations and conversation state automatically. We are introducing two assistants with this release:
    • Math Assistant. How do you get a Large "Language" Model to reason over order of operations in math? We have included a tool based on Assistants patterns.
    • Tabular Data Assistant. How do you reason over tabular data whose schema you do not know in order to answer complex questions in natural language? When a user asks a question in natural language, the Tabular Data Assistant generates Python code to understand the tabular data, reason over it, and generate a response.
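
To illustrate why handing arithmetic to a tool helps with order of operations, here is a minimal sketch of a deterministic calculator that an assistant could call. It is not the Math Assistant's actual tool: the function name `evaluate_arithmetic` and the choice to walk Python's `ast` tree are assumptions made for this example.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Map AST operator node types to the arithmetic they represent.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST) -> Number:
    """Recursively evaluate one node of a parsed arithmetic expression."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, and so on are rejected outright.
    raise ValueError("unsupported expression")


def evaluate_arithmetic(expression: str) -> Number:
    """Evaluate plain arithmetic; the parser supplies precedence and grouping."""
    tree = ast.parse(expression, mode="eval")
    return _evaluate(tree.body)


if __name__ == "__main__":
    print(evaluate_arithmetic("2 + 3 * 4"))    # 14
    print(evaluate_arithmetic("(2 + 3) * 4"))  # 20
```

Because the parser builds the expression tree, multiplication binds tighter than addition without the model having to reason about precedence itself. Only numeric literals and a fixed set of operators are accepted, so arbitrary code in the input raises `ValueError` rather than running.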

Coming Soon

  • Secure Deployment: Updates to ensure all traffic and data are private and encrypted. We will also be compatible with Microsoft's Cloud for Sovereignty Security Baselines, Sovereign Landing Zone Baseline for Online and Corp landing zones.
  • Migrate or Upgrade from v1.0 to v1.1 without the need to reload and reprocess all your files.

Issue Fixes

Fixes #295
Fixes #345
Fixes #467
Fixes #481
Fixes #487
Fixes #493
Fixes #496
Fixes #597

1.0

12 Jan 19:47
27df390

What's New

Lots of new documentation

Performance

  • Enrichment App metric-based autoscaling
  • Function Apps metric-based autoscaling

More secure deployment

  • Use of Key Vault in Bicep deployments to avoid MS Defender alerts
  • Key Vault implementation for App Services

Dependency Updates

  • Document Intelligence API Updated to 2023-07-31
  • Updates to NPM packages addressing CVEs

Hotfixes to resolve bugs

Updated bug report templates

  • Improved bug reports

Full Changelog: v0.4-Delta...v1.0

0.4 Delta

14 Nov 19:21
0cc8d71

What's New

  • Vector Hybrid Search which combines vector similarity with keyword matching to enhance search accuracy.
    • Added document processing pipeline steps to generate embeddings for text-based files. Bring your own embedding model (Azure OpenAI or an open-source embedding model).
    • Extended the document processing pipeline with richer language detection and translation to avoid common errors with OOTB Azure Cognitive Search skillsets
    • Switched to direct search index inserts instead of Azure Cognitive Search Indexer and skillsets
    • Restructured and added vector columns to Azure Cognitive Search Index (expanded JSON into index fields)
    • Updated the UX to embed the user's query and execute Vector Hybrid Search with semantic ranking
    • Added a pipeline to process images and store enrichments as keywords in the Azure Cognitive Search index, which allows users to do text-to-image search.
  • Added iFrame document and image rendering of source material under citation panel of UX.
  • Added support for several new file types using Unstructured.io
    • Text-based: pdf, docx, html, htm, csv, md, pptx, txt, json, xlsx, xml, eml, msg
    • Images: jpg, jpeg, png, gif, bmp, tif, tiff
  • Added support for US Government deployments
  • Added filtered query support for Azure Cognitive Search index fields
    • Enabled upload to a folder and adding tags to uploaded file in UX
    • Enabled filtering search by "folder" and/or "tags" fields in Adjust panel in UX
  • Added function testing of document pre-processing pipelines and embeddings REST APIs
  • Added branding updates that allow a warning banner and UX title updates
  • Enhanced infrastructure and application logging
    • Detailed chunk-based logging for embeddings and indexing
    • New Azure Workbook to help investigate infrastructure-level errors (e.g. App Service not starting up correctly)
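
As a rough picture of how a hybrid search can combine a keyword ranking with a vector-similarity ranking, the sketch below uses Reciprocal Rank Fusion (RRF). These notes do not say how Info Assistant or its search service merges the two result sets, so treat the technique, the function name `reciprocal_rank_fusion`, and the constant `k = 60` (a commonly used default for RRF) as assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # hypothetical keyword-match ranking
    vector_hits = ["c", "a", "d"]   # hypothetical vector-similarity ranking
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
    # ['a', 'c', 'b', 'd']
```

Documents "a" and "c" appear in both lists and therefore outrank "b" and "d", which each appear in only one.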

0.3 Gamma

13 Sep 23:40
d16ef65

What's New

  • Support for GPT-4 using Chat/Completion APIs (backwards compatible with GPT-3.5-Turbo)
  • New UX "Info" panel to show common configuration values (Azure OpenAI, Azure Search, and Language settings)
  • Improved the UX for citation chunks to show as HTML rather than raw JSON
  • Ability to resubmit chat questions using the "Regenerate" button. Regenerating an answer will use the past question with the current "Adjust" settings.
  • Removal of the "Ask a Question" section in the UX
  • CUA (Customer Usage Attribution) enablement
  • Adoption Workshop self-paced learning available
  • Debugging support for the app/frontend TypeScript code

Known Issues

See our updated Known Issues list. Please check the Issues board for any issues you encounter.

Full Changelog: v0.2-Beta...v0.3-Gamma

0.2 Beta

17 Jul 09:50
d22b7f4

What's New

  • Improved prompt engineering focused on reducing hallucinations and ensuring citation generation.
  • Language-specific deployment options. You can now configure the target language of the search index, search skillsets, and prompt engineering.
  • New Content Management view in the website. This provides the ability to view the status of uploaded files.
  • Improved upload processing of PDF files. We have increased the "per load" limit for PDFs to ~200 documents or ~4500 pages that can be processed at once.

Known Issues

  • PDF files uploaded in large batches may get stuck in "Queued" status. If any of your PDF files stay in "Queued" status for more than 30 minutes, upload them again to restart processing.

What's Changed

  • Geearl/5760 file form rec submission pdf by @georearl in #79
  • Build pipeline for Red/Blue deployment. by @asbanger in #72
  • Geearl/5762 file form rec polling pdf by @georearl in #83
  • Geearl/5797 parser error 2 by @georearl in #87
  • Geearl/5793 document map by @georearl in #89
  • fixed blank folder by @georearl in #91
  • Geearl/5803 pdf chunks by @georearl in #90
  • Add NONE indicator to source list when none are available by @dayland in #92
  • fix bug where check for first large para was blocking smaller chunk ouputs by @dayland in #93
  • add import of nltk and punkt by @dayland in #95
  • fix large paragraph chunking logic by @dayland in #97
  • Geearl/5763 non pdf document map by @georearl in #96
  • extend the pipeline to deploy azure functions by @asbanger in #82
  • add "allowSkillsetToReadFileData" property to search indexer by @dayland in #101
  • Update AnalysisPanel to render DOCX in Office viewer by @dayland in #100
  • Geearl/5814 status complete by @georearl in #99
  • Doccumentation update-developing in a codespace using vscode by @asbanger in #102
  • various changes by @georearl in #103
  • Hallucination Resistance prompt with Chain of Thoughts by @ArpitaisAn0maly in #107
  • adding new pipeline for vNext dev branch by @dayland in #108
  • Dayland/5753 add support for webapp ad security in automation by @dayland in #104
  • fixes to auto deployment for limited permissions by @dayland in #109
  • Adding missed parameter to CI/CD pipeline env file by @dayland in #110
  • support for current file debug by @georearl in #112
  • Adding Content page with File Upload and File Status as sub-pages by @dayland in #114
  • Expand ASP to B3 with 3 nodes for function performance by @dayland in #116
  • Fix sorting on file list and add loading... component by @dayland in #117
  • Fix syntax error on new AOAI deployment by @dayland in #118
  • update build to make "shared_code" dir before copy by @dayland in #119
  • Adding configurable language support by @dayland in #122
  • fix for docker daemon socket permission. by @asbanger in #121
  • Geearl/5857 code error by @georearl in #126
  • aparmar/5751-citation-bug. Code changes to enforce citation lookup di… by @ArpitaisAn0maly in #127
  • Changed to GA FR API Vers and fixed error log by @lmwilki in #128
  • MD documentation - update prerequisites by @asbanger in #129
  • MD documentation - update known issues by @asbanger in #130
  • MD documentation for configuration of local dev environment by @asbanger in #132
  • changed to to dictionary to map value of response length and passed r… by @ArpitaisAn0maly in #131
  • 0.2 Beta Release Candidate by @dayland in #133

Full Changelog: v0.1-Alpha...v0.2-Beta

v0.1-Alpha

19 Jun 22:09
ea2fbe1

This is the first Alpha release of the Information Assistant Accelerator. We believe we have achieved enough basic functionality for the accelerator to be deployed and provide these basic features:

  • Chat and Q&A interfaces
  • File Upload and automated chunking and indexing for PDF, HTML, and DOCX
  • Explores various options to help users evaluate the trustworthiness of responses with citations, tracking of source content, etc.
  • Shows possible approaches for data preparation, prompt construction, and orchestration of interaction between model (ChatGPT) and retriever (Cognitive Search)
  • Settings directly in the UX to tweak the behavior and experiment with options