Canadian Securities Administrators – Disciplined Persons

Playing around from where I left off last time, the Canadian Securities Administrators (“CSA”) – Disciplined Persons site has loads of information available about disciplined registrants (those registered with securities commissions in one of the 10 provinces and 3 territories within Canada and licensed to sell to and advise Canadians in regards to securities products).

The data goes back a bunch of years, but short of ‘hunting-and-pecking’, name-by-name, the site does not give any overviews on what is going on with ‘Disciplined Persons’. Aggregating the data using a Ruby crawler and “R” to graph the result offers a better view.

Here’s a Treemap showing disciplinary orders issued by various provincial commissions\court:

Disciplined Persons

 

The next Treemap shows the aggregate data for the penalties meted out by Canadian securities regulators and courts in connection with orders relating securities regulations infractions :

CSA Penalties

Take special note of 41 disciplined-persons penalized with imprisonment!

 

Here’s the code to crawl the data:

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
browser = Watir::Browser.new :chrome
browser.goto 'https://www.securities-administrators.ca/disciplinedpersons.aspx?id=74'
seen = Array.new
('A'..'Z').each do |letter|
browser.link(:text => letter).click
 browser.table(:id => 'ctl00_bodyContent_gv_list').wait_until_present
loop do
 
 table_Rows = browser.table(:id, 'ctl00_bodyContent_gv_list').rows.length
for i in 1..table_Rows.to_i-1 
 name = browser.table(:id, 'ctl00_bodyContent_gv_list')[i][0].text
 date_of_order = browser.table(:id, 'ctl00_bodyContent_gv_list')[i][1].text
 
 browser.link(:text, name).click
 browser.span(:id => 'ctl00_bodyContent_lbl_persondetail').wait_until_present
discipline_table_Rows = browser.table(:id, 'ctl00_bodyContent_DataList1').rows.length
 for t in 0..discipline_table_Rows.to_i - 1
 
 doc = Nokogiri::HTML.parse(browser.html)
 
 
 t_child = t.to_i + 1
 
 
 if t < 1 && discipline_table_Rows < 2 ### Only one finding ###
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr > td > table > tbody > tr:nth-child(1) > td').text
 $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl00_lbl_juridiction').div.text
 payment = doc.css('#ctl00_bodyContent_DataList1_ctl00_tr_SanctionList+ tr td').text
 sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl00_lbl_sanction div').text
 violations = doc.css('#ctl00_bodyContent_DataList1_ctl00_lbl_violation div').text
 elsif t < 2 ### More than one finding ###
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr:nth-child('+t_child.to_s+') > td > table > tbody > tr:nth-child(1) > td').text
 $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl00_lbl_juridiction').div(:class, 'sectiontitle').text
 payment = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_tr_SanctionList+ tr td').text
 sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_sanction div').text
 violations = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_violation div').text
 else
 date_of_order = doc.css('#ctl00_bodyContent_DataList1 > tbody > tr:nth-child('+t_child.to_s+') > td > table > tbody > tr:nth-child(1) > td').text
 if browser.table(:id    , 'ctl00_bodyContent_DataList1').span(:id, 'ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_juridiction').div(:class, 'sectiontitle').exists?
 $regulator = browser.table(:id, 'ctl00_bodyContent_DataList1').span(:id  , 'ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_juridiction').div(:class, 'sectiontitle').text
 else
 end
 payment = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_tr_SanctionList+ tr td').text
 sanctions = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_sanction div').text
 violations = doc.css('#ctl00_bodyContent_DataList1_ctl0'+t.to_s+'_lbl_violation div').text
 end
 
 
 
 if (seen).include?("#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}")
 
 else
 puts "#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}"
 end
 
 seen << "#{name}\t#{t}\t#{date_of_order.strip()}\t#{$regulator}\t#{payment.strip()}\t#{sanctions}\t#{violations}"
 
 end

 browser.back
 browser.span(:id => 'ctl00_bodyContent_lbl_browse').wait_until_present
 
 end
break if browser.div(:id, 'ctl00_bodyContent_pnl_pager_top').text !~ /\d+\.\.\./
 browser.link(:id, 'ctl00_bodyContent_lbtnNext').click
 sleep (2)
 browser.table(:id => 'ctl00_bodyContent_gv_list').wait_until_present
end
end

 

FATCA Unites the World!

Previously I parsed data (from the  FATCA – Archive) and graphed those countries with Intergovernmental Agreements (“IGAs”) in place which covered all Financial Institutions within those jurisdictions. Foreign Financial Institutions (“FFIs”) in countries that do not have IGAs in effect are required to enter into FATCA Agreements directly with the Internal Revenue Service or else face potentially catastrophic 30% withholding on amounts received from entities within IGA countries or from participating FFIs.

The result is the following world map view created by mashing data from here together with data from here and graphing with “R“.

A FATCA World

FATCA truly has brought the world ‘together’ (Iran

, Syria and Somalia currently excepted; although I wonder if these two GIINs are registration errors as they claim a source within the Democratic People’s Republic of Korea [AKRJ7L.00000.SP.408 and 9K5UN9.00000.BR.408]?!).

Robert Wood

, over at Forbes, has an excellent article detailing some of the surprising alignments FATCA has generated.

 

Compliance Graphing with “R” – Part II – CSA Registrants

Previously I had experimented with grabbing data from FINRA BrokerCheck and graphing output with “R”.

This time around the task was to strip data from the Canadian Securities Administrators (“CSA”) – National Registration Database (“NRD”) which lists all Canadian registrants and to create a Treemap to graphically represent the CSA registrants for a group of entities belonging to a Financial Institution.
BMO FG - CSA Registrants by Entity

 

The CSA Registrant data was collected using the following Ruby script:

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
browser = Watir::Browser.new :chrome
browser.goto 'https://www.securities-administrators.ca/nrs/nrsearch.aspx?id=850'
browser.text_field(:name => 'ctl00$bodyContent$txtFirmName').set 'BMO'
 browser.input(:name => 'ctl00$bodyContent$ibtnSearch').click
 browser.select_list(:name    , 'ctl00$bodyContent$list_num_per_page').select_value("100")
 sleep 10

loop do
table_Rows = browser.table(:id, 'ctl00_bodyContent_gvIndividuals').rows.length

 for i in 2..table_Rows.to_i-1 
 name = browser.table(:id  , 'ctl00_bodyContent_gvIndividuals')[i][0].text
 firm = browser.table(:id, 'ctl00_bodyContent_gvIndividuals')[i][1].text
if firm =~ /BMO/ && firm !~ /\(INDIVIDUAL NO LONGER REGISTERED\)/
 puts "#{name}\t#{firm}"
 end
 end

 browser.link(:id => 'ctl00_bodyContent_lbtnNext2').click
 sleep 10
end

FATCA – Intergovernmental Agreement (IGA) Countries Graphed

There is lots going on following the July 1, 2014 implementation date for key aspects of the Foreign Account Tax Act (“FATCA”). Many countries have negotiated Intergovernmental Agreements (“IGAs”) to provide for the ‘smoother’ domestic application of this U.S requirement.

The following graphic was generated with data parsed from the FATCA – Archive and it shows the countries with IGAs in place or close to conclusion

, and was graphed using “R”.
FATCA IGA Countries
The data was parsed from the FATCA – Archive using Ruby and the Mechanize gem as follows:

require 'rubygems'
require 'mechanize'
require 'open-uri'

agent = Mechanize.new
page = agent.get("http://www.treasury.gov/resource-center/tax-policy/treaties/Pages/FATCA-Archive.aspx")
 doc = Nokogiri::HTML(page.body)
 doc.css('#ctl00_PlaceHolderMain_ctl05__ControlWrapper_RichHtmlField div div a').each do |linkz|

if linkz.text =~ /\A(.*?)(\s\(.*?\))\Z/
 puts $1 + "    ,Signed IGA"
end

end

doc = Nokogiri::HTML(page.body)
doc.css('#ctl00_PlaceHolderMain_ctl05__ControlWrapper_RichHtmlField div p~ div li').each do |linkz|

if linkz.text =~ /\A(.*?)(\s\(.*?\))\Z/
 puts $1 + ",Substantive IGA"
end

end

Compliance Graphing with “R” – FINRA Registrants

Previously I had experimented with grabbing data from FINRA BrokerCheck and more recently I have been playing “R” and experimenting with generating graphs.

The task was to create the following Treemap using the data from BrokerCheck to represent FINRA registrants within a family of entities belonging to a Financial Institution.

FINRA Registrants by Entity

 

I think the Treemap does an effective job of summarizing this data

, and I am keen to explore the graphing capabilities of “R“.

“Big” SEDI Insiders – Social Graph

Previously I had been crawling the System for Electronic Disclosure by Insiders (“SEDI”) website exploring the data (here and here). Through some conversations I thought it would be interesting to create a social graph of all the data on this site where an “Insider” was an insider at 20 or more Canadian reporting issuers.

After crawling the data with Watir and Nokogiri infused Ruby script

, I fed the data into Gephi and then used SigmaJs to generate an interactive Social Graph of “Big” Insiders of Reporting Canadian Issuers. You’ll need a modern browser on a desktop or laptop with JavaScript enabled to interact with this graph , otherwise you can pull down this static pdf summary: “Big” CDN SEDI Insiders.

The interactive Social Graph does an excellent job of illustrating the linkages between some of the “Insiders” and the underlying Issuers

, as well as the “distance” amongst others. Use your mouse to zoom in and have a look at the names, entities and linkages; an interesting Canadian perspective; enjoy!

 

SEDI Issuers

I’m still playing around with the ‘irrelevant’ in an effort to stay ‘relevant’!

Back to the SEDI site for some more inspiration and this time to strip-out a listing of all (7,236) Canadian issuers listed here. You could bash-around here, working through the alphabet and numerals to get at the data; but there is a better way!

No need for Watir this time as I am able to bypass the javascript elements and use just Mechanize and Nokogiri Gems (v. addictive!) for a lightening-fast strip-and-parse from this otherwise ‘clunky’ site.

 

require 'nokogiri'
require 'mechanize'
require 'open-uri'
agent = Mechanize.new
ary_of_firms = Array.new
rows = Array.new
 ## Defines Array Class for HTML Table output ###
class Array 
 def to_cells(tag)
 self.map { |c| "<#{tag}>#{c}</#{tag}>" }.join
 end
end
file = File.open('./SEDI_Issuers.html'  , "w")
[*('A'..'Z'),*('0'..'9')].each do |letter|
 page = agent.get('https://www.sedi.ca/sedi/SVTSelectSediIssuer?menukey=15.02.00&locale=en_CA')
sedi_form = page.form('form1')

 sedi_form.ISSUER_NAME = letter

 button = sedi_form.button_with(:value => "Search")

 page = agent.submit(sedi_form, button)

 doc = Nokogiri::HTML(page.body)

 i = 0

 table = doc.css('table')[9]

 #td:nth-child(3) font

 table.css('td:nth-child(3) font').each do |firm|
 if i > 0
 ary_of_firms << firm.text.strip
 end
 i+=1
 end
end
ary_of_firms.sort.uniq.each do |companies|
 puts companies
 rows << {"SEDI Issuer" => companies}
end
### Rolls HTML Table output ###
headers = "<tr>#{rows[0].keys.to_cells('th')}</tr>"
cells = rows.map do |row|
 "<tr>#{row.values.to_cells('td')}</tr>"
end.join("\n ")
table = "<table border=\"1\">
 #{headers}
 #{cells}
</table>"
file.puts table

System for Electronic Disclosure by Insiders (SEDI) – Insider Reporting – Data Retrieval

SEDI “is Canada’s on-line, browser-based service for the filing and viewing of insider reports as required by various provincial securities rules and regulations.”

The Canadian requirement for insiders to report transactions and holdings in publicly traded companies is governed by REGULATION 55-104 RESPECTING INSIDER REPORTING REQUIREMENTS AND EXEMPTIONS.

Experimenting with the SEDI site and attempting to extract a complete listing of “reporting insiders” for select Canadian Financial Institutions

, I encountered the following difficulty when selecting  “A” through “Z” for a given issuer:

Screen shot 2014-01-02 at 9.00.45 PM

It seems that it is necessary to restrict queries (“narrow you search”) to each letter of the alphabet in order to draw reports one-by-one. This would be a slow and tedious process to get at the data in aggregate!

The solution (again!) is to apply a combination of Watir and Nokogiri through the following Ruby script. WATIR drives the browser and Nokogiri for parsing (lightening fast!!) the data on the pages for creating the  the following select “insider” lists:

Now the analytics can begin!

require 'rubygems'
require 'watir-webdriver'
require 'nokogiri'

ary_of_members = Array.new
rows = Array.new

### Defines Array Class for HTML Table output ###
class Array
def to_cells(tag)
 self.map { |c| "<#{tag}>#{c}</#{tag}>" }.join
 end
end
file = File.open('./SEDI_Insiders.html'    , "w")

browser = Watir::Browser.new :chrome

('A'..'Z').each do |letter|

browser.goto 'https://www.sedi.ca/sedi/SVTReportsAccessController?menukey=15.03.00&locale=en_CA'

browser.radio(:value => 'SVTIIBIselectIssuer').set
browser.input(:name=>'Next').click

browser.text_field(:name=>'ISSUER_SEARCH_VALUE').set "Bank of Montreal"

browser.select_list(:name=>'ALPHA_RANGE_FROM').option(:text=>letter).select

browser.select_list(:name=>'ALPHA_RANGE_TO').option(:text=>letter).select

browser.input(:name=>'Search').click

if browser.text.include? "Error: The system found no results matching the selected search criteria combination."

else browser.table(:xpath, '/html/body/table[1]/tbody/tr[3]/td/table/tbody/tr/td/table[9]')[0][0].link.click

doc = Nokogiri::HTML.parse(browser.html)

i = 0
t = 0

doc.xpath("//table").each do |table|

if doc.at_xpath("//table[#{i}]//tr//td[1]").to_s =~ /Insider Name/

insider_name = doc.at_xpath("//table[#{i}]//tr//td[2]").text.strip #Insider Name

ceased_to_b_insider = doc.at_xpath("//table[#{i+2}]//tr//td[2]").text.strip #Ceased to be Insider?

ceased_to_b_insider.gsub!(/Not Applicable/,'')

until doc.at_xpath("//table[#{i+4+t}]//tr/td[2]").to_s !~ /\d{4}-\d{2}-\d{2}/

date = doc.at_xpath("//table[#{i+4+t}]//tr/td[2]").text.strip

share_class = doc.at_xpath("//table[#{i+4+t}]//tr/td[3]").text.strip #Share Class

reg_holder = doc.at_xpath("//table[#{i+4+t}]//tr/td[4]").text.strip #Reg. Holder

nbr_shares = doc.at_xpath("//table[#{i+4+t}]//tr/td[5]").text.strip #Nbr or Shares

#'cleans' HTML
 if reg_holder !~ /\w+/
  reg_holder = ()
 end

rows << {"Insider Name" => insider_name, "Ceased Insider?" => ceased_to_b_insider, "Date" => date, "Share Class" => share_class, "Registered Holder" => reg_holder, "Nbr of Shares" => nbr_shares}

#Insider Name & Ceased to be Insider - just once
 insider_name = ()
 ceased_to_b_insider = ()

t+=2
end

t = 0
end

i+=1

end
end
end

### Rolls HTML Table output ###
headers = "<tr>#{rows[0].keys.to_cells('th')}</tr>"
cells = rows.map do |row|
"<tr>#{row.values.to_cells('td')}</tr>"
end.join("\n ")
table = "<table border=\"1\">
 #{headers}
 #{cells}
</table>"
file.puts table

 

 

 

 

 

 

FINRA List of Members – A “True” Listing of FINRA Members

My last post scraping data of FINRA registrants for a particular Financial Institution got me thinking about getting a listing of all the firms that FINRA regulates.

FINRA does provide a site with a “listing” of all the firms they regulate

, but I wanted all the raw data for use in a spreadsheet (or where ever) and this was the challenge.

This was a great opportunity to use the Ruby Mechanize Gem alongside the Nokogiri Gem for parsing the output. Together these two Gems are very powerful and both crawl and mine data with beautiful efficiency , grabbing the desired data and getting me my “true” list of all FINRA Members.

require 'nokogiri'
require 'mechanize'
require 'open-uri'
ary_of_members = Array.new
rows = Array.new
agent = Mechanize.new
### Defines Array Class for HTML Table output ###
 class Array
 def to_cells(tag)
  self.map { |c| "<#{tag}>#{c}" }.join
 end
 end
file = File.open('./finra_members.html', "w")
page = agent.get('http://www.finra.org/AboutFINRA/MemberFirms/ListOfMembers/p012909')
doc = Nokogiri::HTML(page.body)
doc.css('.FNRW_Alphabetical_DL-result').each do |linkz|
page = agent.get('http://www.finra.org' + linkz['href'])
 doc = Nokogiri::HTML(page.body)
##### Search for nodes by css
doc.css('#col2cont span').each do |firm|
 if firm.to_s =~ /Mailing Address|10pt'>.*?<\/p><\/span>/

 else
 if firm.text =~ /\w+/
  ary_of_members << firm.text
 end
end
end
end
(0..ary_of_members.length).step(2) do |n|
 rows << {"FINRA Member" => ary_of_members[n], "FINRA Member Address" => ary_of_members[n+1]}
end
### Rolls HTML Table output ###
headers = "#{rows[0].keys.to_cells('th')}"
cells = rows.map do |row|
"#{row.values.to_cells('td')}"
end.join("\n ") 
table = "#{headers} #{cells}"
file.puts table

Quick Query for FINRA BrokerCheck

The task was to generate a listing of active FINRA registrants for a particular FI form FINRAs BrokerCheck website and once again some Ruby script with a Watir and Nokogiri Gem was the goto.

 

require 'nokogiri'
require 'rubygems'
require 'watir-webdriver'
rows = Array.new
dupes = Array.new
### Defines Array Class for HTML Table output ###
class Array 
 def to_cells(tag)
 self.map { |c| "<#{tag}>#{c}</#{tag}>" }.join
 end
end

browser = Watir::Browser.new :chrome
browser.goto 'http://brokercheck.finra.org/Search/Search.aspx'
[*('A'..'Z')].each do |letter|

browser.text_field(:name => 'ctl00$phContent$ucUnifiedSearch$txtIndvl').set "#{letter}" + "*"

browser.text_field(:name => 'ctl00$phContent$ucUnifiedSearch$txtFirm').set 'BMO'

browser.input(:name => 'ctl00$phContent$ucUnifiedSearch$lbtnFreeFormSearch').click
loop do
table_Rows = browser.table(:id    , 'ctl00_phContent_gvBrokerTable').rows.length
for i in 0..table_Rows.to_i-1 

 doc = Nokogiri::HTML.parse(browser.html)

 name = browser.table(:id  , 'ctl00_phContent_gvBrokerTable')[i][0].div(:class, 'gvListItemStyle').span.text
 lic_status = browser.table(:id, 'ctl00_phContent_gvBrokerTable')[i][1].text
 status = browser.table(:id, 'ctl00_phContent_gvBrokerTable')[i][0].div(:class, 'gvListItemStyle').text

 if lic_status.to_s =~ /Not Licensed/
elsif status.to_s =~ /BMO/

 source = doc.css('.GrayTextShade:nth-child(3)')[i].text

 if dupes.include?(name)
 else
 puts "#{name}\t#{source}"
 rows << {"Name" => name, "Registration" => source}
 dupes << name
 end
end
end
 if browser.link(:id =>'ctl00_phContent_navPager_lbNext').exists?
browser.link(:id =>'ctl00_phContent_navPager_lbNext').click

 else

 browser.goto 'http://brokercheck.finra.org/Search/Search.aspx'
 break
end
end
end
### Rolls HTML Table output ###
headers = "<tr>#{rows[0].keys.to_cells('th')}</tr>"
cells = rows.map do |row|
 "<tr>#{row.values.to_cells('td')}</tr>"
end.join("\n ")
table = "<table border=\"1\">
 #{headers}
 #{cells}
</table>"
puts table