===========================
====== README =======
===========================
This package contains the code underlying the SCAN model as described in
Frermann and Lapata (2016). The code can be used to (a) create binary files
containing time-stamped input documents (the required input to SCAN); and (b)
to train SCAN models.
This bundle contains three parts:
(1) main.zip
the main functions. Run code from this directory. Also contains example
input and output
(2) scan_lib.zip
the core code underlying SCAN (my code)
(3) other_lib.zip
third-party libraries that are used but are not part of the core Go libraries. Each of these must
be placed in the same directory as scan_lib.zip
To run the code, navigate into your 'main' directory and run
EITHER the pre-compiled binary (should work out of the box)
./dynamic-senses -parameter_file=/path/to/parameters/ [-create_corpus] -store={true,false}
OR the code itself (requires Go to be installed)
go run *.go -parameter_file=/path/to/parameters/ [-create_corpus] -store={true,false}
The command-line parameters are:
- parameter_file
path to a text file containing all parameters (see below)
- create_corpus
an optional parameter. If it is set, a binary corpus will be created; otherwise a model will be trained
- store
indicates whether target word-specific corpora should be stored
To run the code itself (the second option above, i.e., not just the binary), Go (https://golang.org/doc/)
must be installed. I use version go1.7.4 (it might be safest to use the same). All required
packages to run this code should be part of this bundle.
----------------------------------
----- running out of the box -----
----------------------------------
I include examples of all necessary test files:
- a parameters.txt (which has my own hard-coded paths, MUST BE CHANGED)
- a corpus file under main/test_input/corpus.txt
- a file with target words under main/test_input/targets.txt
Change the paths in the parameters.txt to your own paths (to the corpus / targets files),
and run
(a) to create a binary corpus
./dynamic_senses -parameter_file=path/to/parameters.txt -create_corpus -store=true
(b) to train models
./dynamic_senses -parameter_file=path/to/parameters.txt -store=true
---------------------------
--- the parameters file ---
---------------------------
All parameters are specified in a parameters file which must be passed as input to
the program. See the included 'parameters.txt' example file (a rough sketch is also given below). This includes
- paths to underlying text corpora, target word sets, etc
- model parameters (see paper for explanation)
- sampler parameters (number of iterations)
- parameters regarding the time start/end/intervals of interest
- parameters to optionally restrict the minimum number of available documents (to ignore highly
infrequent words) and / or the maximum number of available documents per time interval
(to get manageable-size corpora for very frequent words)
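As a rough, hypothetical sketch of what such a file covers (the key names below are the ones used in this README,
but the key=value layout and every concrete value are my assumptions; the bundled parameters.txt is the
authoritative example):

text_corpus=/path/to/corpus.txt
target_words=/path/to/targets.txt
window_size=5
bin_corpus_store=/path/to/corpus.bin
full_corpus_path=/path/to/corpus.bin
word_corpus_path=/path/to/word_corpora/
output_path=/path/to/output/
start_time=1700
end_time=2009
time_interval=20
kappaF=...
kappaK=...
a0=...
b0=...
num_top=...
iterations=...
min_doc_per_word=500
max_docs_per_slice=...

Values shown as '...' are not stated in this README; take them from the bundled parameters.txt or the paper.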
------------------------------
Creating binary input corpora
------------------------------
Takes a text file, a list of target words, and a 'document length' specification. Outputs
a binary corpus with target-word-specific, time-stamped documents. Document length refers to
the size of the context window considered around the target word, e.g., 5 words.
[The corpus has words mapped to unique IDs, and contains dictionaries mapping from
word strings to IDs and back]
It takes the following parameters (all specified in parameters.txt)
- text_corpus
path to a text file in which each line contains a number indicating
a year (of origin), followed by a \tab\ character, followed by the corresponding text
from that year of origin. The same year can be listed multiple times (a small parsing sketch is given after this list):
YEAR \tab\ text ....
YEAR \tab\ text ....
....
- target_words
path to a text file containing all target words of interest
whose meaning should be tracked, one word per line.
- window_size
the size of the context window to consider around the target word (i.e., the
'document length' explained above)
- bin_corpus_store
path to the location where the binary corpus will be stored
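To make the input format concrete, here is a minimal Go sketch that reads lines in the YEAR \tab\ text format
and cuts out a context window around a target word. It is an illustration only, not the code in scan_lib.zip:
the function names are made up, and the assumption that window_size words are taken on each side of the target
(rather than in total) should be checked against the actual implementation.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseCorpusLine splits one "YEAR<TAB>text" line into its year and text.
// (Hypothetical helper, not taken from scan_lib.zip.)
func parseCorpusLine(line string) (int, string, error) {
	parts := strings.SplitN(line, "\t", 2)
	if len(parts) != 2 {
		return 0, "", fmt.Errorf("not in YEAR<TAB>text format: %q", line)
	}
	year, err := strconv.Atoi(strings.TrimSpace(parts[0]))
	if err != nil {
		return 0, "", err
	}
	return year, parts[1], nil
}

// contextWindows returns, for every occurrence of target, up to size tokens
// on each side of it (assumed interpretation of window_size).
func contextWindows(text, target string, size int) [][]string {
	tokens := strings.Fields(text)
	var docs [][]string
	for i, tok := range tokens {
		if tok != target {
			continue
		}
		lo, hi := i-size, i+size+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(tokens) {
			hi = len(tokens)
		}
		docs = append(docs, append([]string(nil), tokens[lo:hi]...))
	}
	return docs
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // corpus lines can be long
	for scanner.Scan() {
		year, text, err := parseCorpusLine(scanner.Text())
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		for _, doc := range contextWindows(text, "power", 5) {
			fmt.Println(year, strings.Join(doc, " "))
		}
	}
}

Piping main/test_input/corpus.txt through this sketch would print one line per extracted context window for
the target 'power', tagged with its year.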
------------------------------
Training a SCAN model
------------------------------
Once we have a binary corpus of time-stamped documents as explained above (full_corpus_path), we can train SCAN
models that track meaning change of individual target words. To do this we
(1) extract a target word-specific corpus from the underlying binary corpus. It contains only time-tagged
documents with the specified target word. It converts the absolute times in the underlying corpus (e.g., 1764, 1993, ...)
to time intervals (0, 1, ..., T) based on the start_time, end_time and time_interval parameters (see below).
It takes the following parameters (all specified in parameters.txt):
- full_corpus_path
path to the underlying binary corpus
- start_time
the earliest time stamp in the underlying corpus to be considered
- end_time
the latest time stamp in the underlying corpus to be considered
- time_interval
the length of the time intervals into which the span [start_time, end_time] is to be split,
e.g., if start_time=1700, end_time=2000, time_interval=10 then documents are binned into 10-year bins
and all documents from before 1700 and after 2000 are ignored. Documents from 1700-1709 are assigned to
bin 0, documents from 1710-1719 are assigned to bin 1, and so on (see the sketch after this list)
- word_corpus_path
path to a location where word-specific corpora are stored.
The filename reflects the choice of start_time / end_time / time_interval,
e.g., corpus_s1700_e2009_i10.bin for the example above
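As a minimal sketch of the binning rule just described (illustrative only; timeBin is a hypothetical helper,
not a function from scan_lib.zip, and whether end_time itself is still included is an assumption):

// timeBin maps an absolute year to a time-interval index, or ok=false if the
// year falls outside [startTime, endTime] and the document is ignored.
func timeBin(year, startTime, endTime, timeInterval int) (bin int, ok bool) {
	if year < startTime || year > endTime {
		return 0, false
	}
	return (year - startTime) / timeInterval, true
}

For instance, with start_time=1700, end_time=2009 and time_interval=20 (the settings used for the included
example output), a document from 1764 lands in bin (1764-1700)/20 = 3, the 1760-1779 interval.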
(2) pass this corpus to the model and train the model with MCMC inference. It creates a model and human-readable output:
- model.bin the trained model binary and
- output.dat human-readable model output, namely for each time slice its distribution over senses,
and for each sense in each time slice, its distribution over words (as the set of most
highly associated words).
It takes the following parameters (all specified in parameters.txt)
- output_path
directory in which files output.dat and model.bin are to be stored
If the output directory and these files already exist (from a previous run), the old files are moved
to output_old.dat and model_old.bin
- kappaF, kappaK, a0, b0, num_top
model parameters; check paper (or ask me!) for explanations
- iterations
number of training iterations
- min_doc_per_word
the model doesn't work well if only very few documents are available for a target word. You may want to
only learn models for target words that occur at least N (~ 500?) times in the data
- max_docs_per_slice
some words occur extremely often. To get a manageable-size input corpus you can restrict the number of documents
to consider per time interval
-------------------------------
Understanding the output
-------------------------------
The program creates a human-readable output file for each target word in [output_path]/word/output.dat
The included directories under main/test_input/output/ contain output for models trained on the corpus in
main/test_input/corpus.txt for the target words in main/test_input/targets.txt
Models were trained
-- on the corpus.txt file (containing text from between 1700 and 2009)
-- with K=8 senses per word
-- with start_time=1700, end_time=2009, time_interval=20 --> obtaining 16 20-year time intervals in total
In each output.dat file:
-- K indexes the sense ID
-- T indexes the time slice ID
-- After each sense / time: The top 10 words with highest probability under each sense are listed
-- The bottom line of each block (p(w|s)....) sums over all senses, i.e., shows the most highly associated words
for a particular time, ignoring sense information.
The file output.dat contains the same information in two ways.
*** per type ***
First, I list *by sense* the representation of the
same sense (k=0...K) for each time slice. The first number indicates the sense's prevalence at that
particular time (as a probability, between 0 and 1). Look for example at main/example_input/output/power/output.dat.
We can see that sense 2 (the block with K=2) seems to relate to the 'electricity' sense of power (it has
highly associated words 'battery', 'plant', etc., especially towards later times, and its prevalence increases
towards later time slices).
*** per time ***
This lists the senses associated with each time interval (the content of the lines is the same as above). E.g.,
the senses associated with T=0 (the first block under 'per_time') show that sense K=7 has high probability
and sense K=2 (the 'electricity' sense of power, including e.g., the word 'dynamo') has low probability. Sense K=7 refers to
the 'mental' power.
Output for the targets 'battery' and 'transport' is also included.