Tag Archives: Ruby

Canadian Securities Administrators – Disciplined Persons

Playing around from where I left off last time, the Canadian Securities Administrators (“CSA”) – Disciplined Persons site has loads of information available about disciplined registrants (those registered with securities commissions in one of the 10 provinces and 3 territories within Canada and licensed to sell to and advise Canadians in regards to securities products).

The data goes back a bunch of years, but short of ‘hunting-and-pecking’, name-by-name, the site does not give any overviews on what is going on with ‘Disciplined Persons’. Aggregating the data using a Ruby crawler and “R” to graph the result offers a better view.

Here’s a Treemap showing disciplinary orders issued by various provincial commissions\court:

Disciplined Persons

 

The next Treemap shows the aggregate data for the penalties meted out by Canadian securities regulators and courts in connection with orders relating securities regulations infractions :

CSA Penalties

Take special note of 41 disciplined-persons penalized with imprisonment!

 

Here’s the code to crawl the data:

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
browser = Watir::Browser.new :chrome
browser.goto 'https://www.securities-administrators.ca/disciplinedpersons.aspx?id=74'
seen = Array.new
('A'..'Z').each do |letter|
browser.link(:text => letter).click
 browser.table(:id => 'ctl00_bodyContent_gv_list').wait_until_present
loop do
 
 table_Rows = browser.table(:id, 'ctl00_bodyContent_gv_list').rows.length
for i in 1..table_Rows.to_i-1 
 name = browser.table(:id, 'ctl00_bodyContent_gv_list')[i][0].text
 date_of_order = browser.table(:id, 'ctl00_bodyContent_gv_list')[i][1].text
 
 browser.link(:text, name).click
 browser.span(:id => 'ctl00_bodyContent_lbl_persondetail').wait_until_present
discipline_table_Rows = browser.table(:id, 'ctl00_bodyContent_DataList1').rows.length
 for t in 0..discipline_table_Rows.to_i - 1
 
 doc = Nokogiri::HTML.parse(browser.html)
 
 
 t_child = t.to_i + 1
 
 
 if t < 1 && discipline_table_Rows < 2 ### Only one finding ###
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr > td > table > tbody > tr:nth-child(1) > td').text
 $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl00_lbl_juridiction').div.text
 payment = doc.css('#ctl00_bodyContent_DataList1_ctl00_tr_SanctionList+ tr td').text
 sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl00_lbl_sanction div').text
 violations = doc.css('#ctl00_bodyContent_DataList1_ctl00_lbl_violation div').text
 elsif t < 2 ### More than one finding ###
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr:nth-child('+t_child.to_s+') > td > table > tbody > tr:nth-child(1) > td').text
 $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl00_lbl_juridiction').div(:class, 'sectiontitle').text
 payment = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_tr_SanctionList+ tr td').text
 sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_sanction div').text
 violations = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_violation div').text
 else
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr:nth-child('+t_child.to_s+') > td > table > tbody > tr:nth-child(1) > td').text
 if browser.table(:id   

generic levitra

, 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_juridiction').div(:class, 'sectiontitle').exists? $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id

ogrish forum

, 'ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_juridiction').div(:class, 'sectiontitle').text else end payment = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_tr_SanctionList+ tr td').text sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_sanction div').text violations = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_violation div').text end if (seen).include?("#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}") else puts "#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}" end seen << "#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}" end

 browser.back
 browser.span(:id => 'ctl00_bodyContent_lbl_browse').wait_until_present
 
 end
break if browser.div(:id, 'ctl00_bodyContent_pnl_pager_top').text !~ /\d+\.\.\./
 browser.link(:id, 'ctl00_bodyContent_lbtnNext').click
 sleep (2)
 browser.table(:id => 'ctl00_bodyContent_gv_list').wait_until_present
end
end

 

FATCA Unites the World!

Previously I parsed data (from the  FATCA – Archive) and graphed those countries with Intergovernmental Agreements (“IGAs”) in place which covered all Financial Institutions within those jurisdictions. Foreign Financial Institutions (“FFIs”) in countries that do not have IGAs in effect are required to enter into FATCA Agreements directly with the Internal Revenue Service or else face potentially catastrophic 30% withholding on amounts received from entities within IGA countries or from participating FFIs.

The result is the following world map view created by mashing data from here together with data from here and graphing with “R“.

A FATCA World

FATCA truly has brought the world ‘together’ (Iran

, Syria and Somalia currently excepted; although I wonder if these two GIINs are registration errors as they claim a source within the Democratic People’s Republic of Korea [AKRJ7L.00000.SP.408 and 9K5UN9.00000.BR.408]?!).

Robert Wood

ohne-rezeptkaufen.com

, over at Forbes, has an excellent article detailing some of the surprising alignments FATCA has generated.

 

Compliance Graphing with “R” – Part II – CSA Registrants

Previously I had experimented with grabbing data from FINRA BrokerCheck and graphing output with “R”.

This time around the task was to strip data from the Canadian Securities Administrators (“CSA”) – National Registration Database (“NRD”) which lists all Canadian registrants and to create a Treemap to graphically represent the CSA registrants for a group of entities belonging to a Financial Institution.
BMO FG - CSA Registrants by Entity

 

The CSA Registrant data was collected using the following Ruby script:

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
browser = Watir::Browser.new :chrome
browser.goto 'https://www.securities-administrators.ca/nrs/nrsearch.aspx?id=850'
browser.text_field(:name => 'ctl00$bodyContent$txtFirmName').set 'BMO'
 browser.input(:name => 'ctl00$bodyContent$ibtnSearch').click
 browser.select_list(:name 

buykamagrausa.com

perfect money problems

, 'ctl00$bodyContent$list_num_per_page').select_value("100") sleep 10 loop do table_Rows = browser.table(:id, 'ctl00_bodyContent_gvIndividuals').rows.length for i in 2..table_Rows.to_i-1 name = browser.table(:id , 'ctl00_bodyContent_gvIndividuals')[i][0].text firm = browser.table(:id, 'ctl00_bodyContent_gvIndividuals')[i][1].text if firm =~ /BMO/ && firm !~ /\(INDIVIDUAL NO LONGER REGISTERED\)/ puts "#{name}\t#{firm}" end end browser.link(:id => 'ctl00_bodyContent_lbtnNext2').click sleep 10 end

FATCA – Intergovernmental Agreement (IGA) Countries Graphed

There is lots going on following the July 1, 2014 implementation date for key aspects of the Foreign Account Tax Act (“FATCA”). Many countries have negotiated Intergovernmental Agreements (“IGAs”) to provide for the ‘smoother’ domestic application of this U.S requirement.

The following graphic was generated with data parsed from the FATCA – Archive and it shows the countries with IGAs in place or close to conclusion

, and was graphed using “R”.
FATCA IGA Countries
The data was parsed from the FATCA – Archive using Ruby and the Mechanize gem as follows:

require 'rubygems'
require 'mechanize'
require 'open-uri'

agent = Mechanize.new
page = agent.get("http://www.treasury.gov/resource-center/tax-policy/treaties/Pages/FATCA-Archive.aspx")
 doc = Nokogiri::HTML(page.body)
 doc.css('#ctl00_PlaceHolderMain_ctl05__ControlWrapper_RichHtmlField div div a').each do |linkz|

if linkz.text =~ /\A(.*?)(\s\(.*?\))\Z/
 puts $1 + " 

2pharmaceuticals.com

,Signed IGA" end end doc = Nokogiri::HTML(page.body) doc.css('#ctl00_PlaceHolderMain_ctl05__ControlWrapper_RichHtmlField div p~ div li').each do |linkz| if linkz.text =~ /\A(.*?)(\s\(.*?\))\Z/ puts $1 + ",Substantive IGA" end end

SEDI Issuers

I’m still playing around with the ‘irrelevant’ in an effort to stay ‘relevant’!

Back to the SEDI site for some more inspiration and this time to strip-out a listing of all (7,236) Canadian issuers listed here. You could bash-around here, working through the alphabet and numerals to get at the data; but there is a better way!

No need for Watir this time as I am able to bypass the javascript elements and use just Mechanize and Nokogiri Gems (v. addictive!) for a lightening-fast strip-and-parse from this otherwise ‘clunky’ site.

 

require 'nokogiri'
require 'mechanize'
require 'open-uri'
agent = Mechanize.new
ary_of_firms = Array.new
rows = Array.new
 ## Defines Array Class for HTML Table output ###
class Array 
 def to_cells(tag)
 self.map { |c| "<#{tag}>#{c}</#{tag}>" }.join
 end
end
file = File.open('./SEDI_Issuers.html' 

paypal prepaid debit

, "w")
[*('A'..'Z'),*('0'..'9')].each do |letter|
 page = agent.get('https://www.sedi.ca/sedi/SVTSelectSediIssuer?menukey=15.02.00&locale=en_CA')
sedi_form = page.form('form1')

 sedi_form.ISSUER_NAME = letter

 button = sedi_form.button_with(:value => "Search")

 page = agent.submit(sedi_form, button)

 doc = Nokogiri::HTML(page.body)

 i = 0

 table = doc.css('table')[9]

 #td:nth-child(3) font

 table.css('td:nth-child(3) font').each do |firm|
 if i > 0
 ary_of_firms << firm.text.strip
 end
 i+=1
 end
end
ary_of_firms.sort.uniq.each do |companies|
 puts companies
 rows << {"SEDI Issuer" => companies}
end
### Rolls HTML Table output ###
headers = "<tr>#{rows[0].keys.to_cells('th')}</tr>"
cells = rows.map do |row|
 "<tr>#{row.values.to_cells('td')}</tr>"
end.join("\n ")
table = "<table border=\"1\">
 #{headers}
 #{cells}
</table>"
file.puts table

Buy Nolvadex UK

FINRA List of Members – A “True” Listing of FINRA Members

My last post scraping data of FINRA registrants for a particular Financial Institution got me thinking about getting a listing of all the firms that FINRA regulates.

FINRA does provide a site with a “listing” of all the firms they regulate

, but I wanted all the raw data for use in a spreadsheet (or where ever) and this was the challenge.

This was a great opportunity to use the Ruby Mechanize Gem alongside the Nokogiri Gem for parsing the output. Together these two Gems are very powerful and both crawl and mine data with beautiful efficiency

buy kamagra

perfect money

, grabbing the desired data and getting me my “true” list of all FINRA Members.

require 'nokogiri'
require 'mechanize'
require 'open-uri'
ary_of_members = Array.new
rows = Array.new
agent = Mechanize.new
### Defines Array Class for HTML Table output ###
 class Array
 def to_cells(tag)
  self.map { |c| "<#{tag}>#{c}" }.join
 end
 end
file = File.open('./finra_members.html', "w")
page = agent.get('http://www.finra.org/AboutFINRA/MemberFirms/ListOfMembers/p012909')
doc = Nokogiri::HTML(page.body)
doc.css('.FNRW_Alphabetical_DL-result').each do |linkz|
page = agent.get('http://www.finra.org' + linkz['href'])
 doc = Nokogiri::HTML(page.body)
##### Search for nodes by css
doc.css('#col2cont span').each do |firm|
 if firm.to_s =~ /Mailing Address|10pt'>.*?<\/p><\/span>/

 else
 if firm.text =~ /\w+/
  ary_of_members << firm.text
 end
end
end
end
(0..ary_of_members.length).step(2) do |n|
 rows << {"FINRA Member" => ary_of_members[n], "FINRA Member Address" => ary_of_members[n+1]}
end
### Rolls HTML Table output ###
headers = "#{rows[0].keys.to_cells('th')}"
cells = rows.map do |row|
"#{row.values.to_cells('td')}"
end.join("\n ") 
table = "#{headers} #{cells}"
file.puts table

Quick Query for FINRA BrokerCheck

The task was to generate a listing of active FINRA registrants for a particular FI form FINRAs BrokerCheck website and once again some Ruby script with a Watir and Nokogiri Gem was the goto.

 

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
rows = Array.new
dupes = Array.new
### Defines Array Class for HTML Table output ###
class Array 
 def to_cells(tag)
 self.map { |c| "<#{tag}>#{c}</#{tag}>" }.join
 end
end

browser = Watir::Browser.new :chrome
browser.goto 'http://brokercheck.finra.org/Search/Search.aspx'
[*('A'..'Z')].each do |letter|

browser.text_field(:name => 'ctl00$phContent$ucUnifiedSearch$txtIndvl').set "#{letter}" + "*"

browser.text_field(:name => 'ctl00$phContent$ucUnifiedSearch$txtFirm').set 'BMO'

browser.input(:name => 'ctl00$phContent$ucUnifiedSearch$lbtnFreeFormSearch').click
loop do
table_Rows = browser.table(:id 

Köp Kamagra Receptfritt

, 'ctl00_phContent_gvBrokerTable').rows.length
for i in 0..table_Rows.to_i-1 

 doc = Nokogiri::HTML.parse(browser.html)

 name = browser.table(:id  , 'ctl00_phContent_gvBrokerTable')[i][0].div(:class, 'gvListItemStyle').span.text
 lic_status = browser.table(:id, 'ctl00_phContent_gvBrokerTable')[i][1].text
 status = browser.table(:id, 'ctl00_phContent_gvBrokerTable')[i][0].div(:class, 'gvListItemStyle').text

 if lic_status.to_s =~ /Not Licensed/
elsif status.to_s =~ /BMO/

 source = doc.css('.GrayTextShade:nth-child(3)')[i].text

 if dupes.include?(name)
 else
 puts "#{name}\t#{source}"
 rows << {"Name" => name, "Registration" => source}
 dupes << name
 end
end
end
 if browser.link(:id =>'ctl00_phContent_navPager_lbNext').exists?
browser.link(:id =>'ctl00_phContent_navPager_lbNext').click

 else

 browser.goto 'http://brokercheck.finra.org/Search/Search.aspx'
 break
end
end
end
### Rolls HTML Table output ###
headers = "<tr>#{rows[0].keys.to_cells('th')}</tr>"
cells = rows.map do |row|
 "<tr>#{row.values.to_cells('td')}</tr>"
end.join("\n ")
table = "<table border=\"1\">
 #{headers}
 #{cells}
</table>"
puts table